A modified graphical gaussian model approach for genetic...
![Page 1: A modified graphical gaussian model approach for genetic …compdiag.molgen.mpg.de/docs/talk_12_01_04_wille.pdf · MLF3 YLL058W YJL212C PCL10 LYS1 YBR139W YCR059C † L(q)= L(q|g)P(g)](https://reader033.fdocuments.net/reader033/viewer/2022051905/5ff72cc1e32601375708c2b8/html5/thumbnails/1.jpg)
A modified graphical Gaussian model approach for genetic regulatory networks
Anja Wille, SfS, Reverse Engineering Project, ETHZ
Introduction
• biological scenarios
• graphical Gaussian models (GGMs)
• modified GGMs
Biological scenarios
[Figure: Isoprenoid pathways across three compartments (Chloroplast, Cytosol, Mitochondria). Gene labels: DXPS1, DXPS2, DXPS3, DXPSx, DXR, MCT, CMK, MECPS, IPPI1, IPPI2, GPPS, GPPS(P), GPPS(Q), GGPPS1 to GGPPS12, GPPR, HMGS, AACT1, AACT2, HMGR1, HMGR2, MK, MPDC2, FPPS1, FPPS2, PS, DPPS. End products: Carotenoids, Chlorophylls, Phytosterols, Sesquiterpenes.]
Genetic regulation
39 genes, 118 observations
… signals …
Graphical model to estimate conditional dependence between genes
Graphical Gaussian Models
• undirected
• random variables follow a multivariate normal distribution
• log-likelihood for a sample of N observations $y_1, \dots, y_N$ with sample mean $\bar{y}$, sample covariance matrix $S$ and precision matrix $\Omega = \Sigma^{-1}$:
$$l(\mu, \Omega) = -\frac{N}{2}\left(q \ln(2\pi) + \ln|\Sigma| + \operatorname{tr}(\Omega S) + (\bar{y} - \mu)'\,\Omega\,(\bar{y} - \mu)\right)$$
• partial correlation coefficient $\omega_{ij|\text{rest}}$ for gene pair ij
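As a minimal sketch of the last bullet: the partial correlation of a gene pair given all remaining genes can be read off the inverse covariance (precision) matrix through the standard identity $\rho_{ij|\text{rest}} = -\omega_{ij} / \sqrt{\omega_{ii}\,\omega_{jj}}$. The function names and the toy covariance matrix below are illustrative assumptions and do not come from the talk.

```python
import math

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity on the right.
    aug = [list(map(float, row)) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlations(cov):
    """Partial correlation of every pair given all remaining variables."""
    omega = invert(cov)  # precision matrix
    n = len(cov)
    return [[1.0 if i == j
             else -omega[i][j] / math.sqrt(omega[i][i] * omega[j][j])
             for j in range(n)] for i in range(n)]

# Toy covariance (assumption): a chain 0 - 1 - 2, where genes 0 and 2
# are correlated only through gene 1.
cov = [[1.0, 0.5, 0.25],
       [0.5, 1.0, 0.5],
       [0.25, 0.5, 1.0]]
pcor = partial_correlations(cov)
print(pcor[0][1], pcor[0][2])
```

In this toy case genes 0 and 2 have pairwise correlation 0.25, yet their partial correlation given gene 1 vanishes, which is the kind of distinction a GGM is meant to capture.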
Bootstrapping and pairwise correlation
Application to isoprenoid pathways
[Figure: isoprenoid pathway diagram with the same gene, compartment and end-product labels as on the "Biological scenarios" slide]
Problems
• matrix inversion is rank-sensitive
• number of genes ↑ ⇒ exponentially increasing number of models ⇒ difficult to interpret
• how to interpret a high partial correlation accompanied by a low pairwise correlation?
• efficient procedure for attaching new genes needed
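The first two problems can be made concrete with a little arithmetic. The sketch below assumes the usual facts that an undirected graph on q genes has q(q-1)/2 candidate edges, hence 2^(q(q-1)/2) possible models, and that a mean-centred sample covariance matrix from N observations has rank at most N - 1. The function names are hypothetical.

```python
def n_candidate_edges(n_genes):
    """Number of gene pairs, i.e. possible undirected edges."""
    return n_genes * (n_genes - 1) // 2

def n_undirected_graphs(n_genes):
    """Number of distinct undirected graphs (candidate GGMs)."""
    return 2 ** n_candidate_edges(n_genes)

def max_covariance_rank(n_obs, n_genes):
    """Upper bound on the rank of a mean-centred sample covariance matrix."""
    return min(n_obs - 1, n_genes)

def covariance_invertible_in_principle(n_obs, n_genes):
    """True if the sample covariance can have full rank at all."""
    return max_covariance_rank(n_obs, n_genes) == n_genes

# Sizes from the talk: 39 genes, 118 observations.
print(n_candidate_edges(39))
print(covariance_invertible_in_principle(118, 39))
# With thousands of attached genes the full covariance matrix cannot be inverted.
print(covariance_invertible_in_principle(118, 1000))
```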
Outline for modified GGM approach
For each pair of genes i,j:
• take pairwise correlation into account
• fit GGMs with gene triples i,j,k for all remaining genes k to study the partial correlation
• combine GGMs and pairwise correlation for inference on edge ij
• 3 methods that differ with respect to statistical framework and computational costs
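For a triple i, j, k the partial correlation has a closed form in terms of pairwise correlations, $\rho_{ij|k} = (r_{ij} - r_{ik} r_{jk}) / \sqrt{(1 - r_{ik}^2)(1 - r_{jk}^2)}$, so the triple step of the outline can be sketched without any matrix inversion. The gene names, data values and function names below are made up for illustration.

```python
import math

def pearson(x, y):
    """Pairwise (Pearson) correlation of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def first_order_partial(r_ij, r_ik, r_jk):
    """Partial correlation of i and j given a single third gene k."""
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

def triple_partials(expr, i, j):
    """Return r_ij and the partial correlation of (i, j) given each other gene k.

    expr maps a gene name to its list of expression values.
    """
    r_ij = pearson(expr[i], expr[j])
    partials = {}
    for k in expr:
        if k in (i, j):
            continue
        partials[k] = first_order_partial(
            r_ij, pearson(expr[i], expr[k]), pearson(expr[j], expr[k]))
    return r_ij, partials

# Hypothetical expression values for three genes over five observations.
expr = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.0, 1.0, 4.0, 3.0, 5.0],
    "C": [1.0, 3.0, 2.0, 5.0, 4.0],
}
r_ab, partials = triple_partials(expr, "A", "B")
print(r_ab, partials)
```

How the pairwise and triple-wise quantities are then combined into a decision about edge ij is what distinguishes the three methods.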
Frequentist approach
Focus on gene pair ij
• $p_{ij|k}$ is the p-value from the deviance test of $\omega_{ij|k} \neq 0$ versus $\omega_{ij|k} = 0$
[Diagram: gene triple i, j, k]
Frequentist approach (cont’d)
1) Form $p_{ij,\max} = \max_{k \neq i,j} p_{ij|k}$
2) Adjust $p_{ij,\max}$ according to Bonferroni-Holm or FDR
3) If the adjusted value of $p_{ij,\max}$ is below 0.05, draw an edge between i and j
[Diagram: gene triple i, j, k]
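The three steps can be sketched as follows. The talk does not spell out the test statistic, so this sketch assumes the standard deviance for dropping edge ij from a saturated three-gene GGM, $-N \ln(1 - \rho_{ij|k}^2)$, referred to a chi-squared distribution with one degree of freedom, and it uses the Bonferroni-Holm adjustment rather than FDR. All function names and the toy correlation matrix are illustrative.

```python
import math
from itertools import combinations

def deviance_p_value(partial_corr, n_obs):
    """p-value of the deviance test for a zero partial correlation (chi2, 1 df)."""
    rho_sq = min(partial_corr ** 2, 1.0 - 1e-12)  # guard against log(0)
    deviance = max(0.0, -n_obs * math.log(1.0 - rho_sq))
    # Survival function of chi-squared with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(deviance / 2.0))

def holm_adjust(p_values):
    """Bonferroni-Holm adjusted p-values for a dict {key: p}."""
    order = sorted(p_values, key=p_values.get)
    m = len(order)
    adjusted, running_max = {}, 0.0
    for rank, key in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * p_values[key]))
        adjusted[key] = running_max
    return adjusted

def frequentist_edges(corr, n_obs, alpha=0.05):
    """Edges whose adjusted maximal triple p-value is below alpha."""
    q = len(corr)
    p_max = {}
    for i, j in combinations(range(q), 2):
        p_vals = []
        for k in range(q):
            if k in (i, j):
                continue
            rho = (corr[i][j] - corr[i][k] * corr[j][k]) / math.sqrt(
                (1 - corr[i][k] ** 2) * (1 - corr[j][k] ** 2))
            p_vals.append(deviance_p_value(rho, n_obs))
        p_max[(i, j)] = max(p_vals)  # step 1
    adjusted = holm_adjust(p_max)    # step 2
    return sorted(pair for pair, p in adjusted.items() if p < alpha)  # step 3

# Toy correlation matrix (assumption): chain 0 - 1 - 2, N = 118 observations.
corr = [[1.0, 0.5, 0.25],
        [0.5, 1.0, 0.5],
        [0.25, 0.5, 1.0]]
edges = frequentist_edges(corr, n_obs=118)
print(edges)
```

Taking the maximum over k means a single third gene that "explains" the correlation is enough to remove the edge, which keeps the procedure conservative.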
Application to isoprenoid pathway
[Figure: isoprenoid pathway diagram with the same gene, compartment and end-product labels as on the "Biological scenarios" slide]
Likelihood approach with parameters θ
Estimate $\theta = \{\theta_{ij} \text{ for all } i,j\}$ in a maximum likelihood approach
[Diagram: gene triple i, j, k with parameter $\theta_{ij}$ for edge ij]
$$L(\theta) = \sum_{g \in G} L(\theta \mid g)\, P(g)$$
EM algorithm
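[Diagram: gene triple i, j, k with parameters $\theta_{ij}$, $\theta_{ik}$, $\theta_{jk}$ for edges ij, ik, jk]
$$Q(\theta \mid \theta^t) = \sum_{g \in G} L(\theta \mid g)\, P(g \mid \theta^t, y) \quad \text{to be maximized}$$
$$\theta_{ij}^{t+1} = \prod_{k \neq i,j} \frac{\sum_{g \mid g_{ij}=1} (\theta_{ik}^t)^{g_{ik}} (1-\theta_{ik}^t)^{1-g_{ik}} (\theta_{jk}^t)^{g_{jk}} (1-\theta_{jk}^t)^{1-g_{jk}} \cdot L(\mu_g, \Omega_g)}{\sum_{g} (\theta_{ik}^t)^{g_{ik}} (1-\theta_{ik}^t)^{1-g_{ik}} (\theta_{jk}^t)^{g_{jk}} (1-\theta_{jk}^t)^{1-g_{jk}} \cdot L(\mu_g, \Omega_g)}$$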
Simplification
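Full update:
$$\theta_{ij}^{t+1} = \prod_{k \neq i,j} \frac{\sum_{g \mid g_{ij}=1} (\theta_{ik}^t)^{g_{ik}} (1-\theta_{ik}^t)^{1-g_{ik}} (\theta_{jk}^t)^{g_{jk}} (1-\theta_{jk}^t)^{1-g_{jk}} \cdot L(\mu_g, \Omega_g)}{\sum_{g} (\theta_{ik}^t)^{g_{ik}} (1-\theta_{ik}^t)^{1-g_{ik}} (\theta_{jk}^t)^{g_{jk}} (1-\theta_{jk}^t)^{1-g_{jk}} \cdot L(\mu_g, \Omega_g)}$$
Simplified update:
$$\theta_{ij}^{t+1} = \prod_{k \neq i,j} \left[\theta_{ik}^t \theta_{jk}^t \cdot P(\omega_{ij|k} > 0 \mid y) + (1 - \theta_{ik}^t \theta_{jk}^t) \cdot P(\sigma_{ij} > 0 \mid y)\right]$$
[Diagram: gene triple i, j, k with parameters $\theta_{ij}$, $\theta_{ik}$, $\theta_{jk}$]
Not all GGMs with i, j, k are considered.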
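The simplified update lends itself to a short fixed-point iteration. The sketch below takes the two probabilities in the formula as precomputed inputs, since the slide does not say how they are obtained, and starts the iteration from the pairwise probabilities, which is an assumption. `edge_prob` holds the current edge parameter for each pair; all names and toy numbers are hypothetical.

```python
from itertools import combinations

def update_step(edge_prob, p_partial, p_pair):
    """One pass of the simplified update over all gene pairs.

    edge_prob[i][j]    current edge parameter for pair ij
    p_partial[i][j][k] probability of a nonzero partial correlation of ij given k
    p_pair[i][j]       probability of a nonzero pairwise correlation of ij
    """
    q = len(edge_prob)
    new = [row[:] for row in edge_prob]
    for i, j in combinations(range(q), 2):
        value = 1.0
        for k in range(q):
            if k in (i, j):
                continue
            weight = edge_prob[i][k] * edge_prob[j][k]
            value *= weight * p_partial[i][j][k] + (1.0 - weight) * p_pair[i][j]
        new[i][j] = new[j][i] = value
    return new

def iterate(p_partial, p_pair, max_iter=100, tol=1e-8):
    """Repeat the update until the parameters stop changing (or max_iter)."""
    q = len(p_pair)
    edge_prob = [row[:] for row in p_pair]  # assumption: pairwise start values
    for _ in range(max_iter):
        new = update_step(edge_prob, p_partial, p_pair)
        change = max(abs(new[i][j] - edge_prob[i][j])
                     for i, j in combinations(range(q), 2))
        edge_prob = new
        if change < tol:
            break
    return edge_prob

# Toy inputs for three genes (made-up numbers): the correlation between
# genes 0 and 2 is largely explained by gene 1.
q = 3
p_pair = [[0.0, 0.9, 0.9], [0.9, 0.0, 0.9], [0.9, 0.9, 0.0]]
p_partial = [[[0.0] * q for _ in range(q)] for _ in range(q)]
p_partial[0][1][2] = 0.95
p_partial[1][2][0] = 0.95
p_partial[0][2][1] = 0.05
first = update_step(p_pair, p_partial, p_pair)
final = iterate(p_partial, p_pair)
print(first[0][2], final[0][2])
```

Each factor is a weighted average of two probabilities, so every updated parameter stays between 0 and 1.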
Distribution of $\theta_{ij}$
Conclusions
• GGMs of gene triples are used to check whether the correlation between two genes can be “explained” by a third one
• frequentist approach is simple and can be applied to many genes
• approach with θ-parameters requires iteration; tested for up to 70 genes
• a large set of additional genes can be attached to the constructed network
Modeling at two levels
[Diagram: gene triple i, j, k]
genetic network
• small number of genes, <100
• model edges
attach additional genes
• possibly 1000s
• which ones “explain” edges?
Attaching additional genes
For additional genes k
• include k in the computation of $\theta_{ij}$ but keep $\theta_{ik}$ and $\theta_{jk}$ fixed
• $\theta_{ij}$ decreases
• count how often k decreases $\theta_{ij}$, validate
[Diagram: gene triple i, j, k with $\theta_{jk}$ fixed]
Attaching genes from other pathways
[Figure panels: without additional genes; with additional genes]
Attaching genes from other pathways
both pathways: carotenoid*, tocopherol*, calvin cycle, chlorophyll*, abscisic acid*, one carbon pool, phytosterol*
chloroplast: carotenoid*, one carbon pool, chlorophyll*, calvin cycle
cytoplasm: tocopherol*, phytosterol*
downstream pathways marked by *
Attaching genes from other pathways
Hierarchical clustering
Conclusions
Modified GGMs
• model dependence between genes
• combine GGMs and pairwise correlation for inference on edge ij
• methods differ in statistical design and computational cost
• additional genes can be fitted into the model
• similarities in expression patterns between groups of genes can be identified (also verified in yeast data)
Comparison of different methods
[Figure: comparison of pairwise correlation, GGM, Modified GGM1, Modified GGM2 and Modified GGM3; values shown in the figure: 0.68, 0.96, 0.60, 0.80, 0.81, 0.56, 0.49, 0.47, 0.43]
Consistent results
[Figure: isoprenoid pathway diagram with the same gene, compartment and end-product labels as on the "Biological scenarios" slide]
Acknowledgments
• Peter Buehlmann
• Stefan Bleuler, Amela Prelic, Eckhard Zitzler
• Philip Zimmermann, Lars Hennig, Willi Gruissem
Galactose pathway in yeast
Galactose pathway in yeast
Network for galactose pathway
Network for galactose pathway
Genes attached to the network:
GCY1, FAR1, YDR010C, YEL057C, YPL066W, YOR121C, MLF3, YLL058W, YJL212C, PCL10, LYS1, YBR139W, YCR059C