bV 7DON 1HWZRUNV C f C -...


Page 1 (source: bioinformatics.org.au/ws09/presentations/Day4_SImoto.pdf)

Copyright, Seiya Imoto, Human Genome Center, University of Tokyo, 2009

X, Y: random variables

Pr(X, Y): joint probability of X and Y

Pr(X): marginal probability of X

$$ \Pr(X) = \sum_{Y} \Pr(X, Y) $$

Pr(X | Y): conditional probability of X given Y

$$ \Pr(X \mid Y) = \frac{\Pr(X, Y)}{\Pr(Y)} = \frac{\Pr(Y \mid X)\,\Pr(X)}{\Pr(Y)} \qquad \text{(Bayes' theorem)} $$
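The definitions on this page can be checked numerically on a small discrete example. The sketch below uses a made-up 2 x 3 joint table (the numbers are purely illustrative, not from the slides) and confirms that the conditional probability computed from the joint agrees with the Bayes' theorem form.

```python
import numpy as np

# Hypothetical joint table Pr(X, Y): rows are X in {0, 1}, columns are Y in {0, 1, 2}.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.30, 0.05, 0.25]])

pr_x = joint.sum(axis=1)               # marginal Pr(X) = sum over Y of Pr(X, Y)
pr_y = joint.sum(axis=0)               # marginal Pr(Y) = sum over X of Pr(X, Y)
pr_x_given_y = joint / pr_y            # Pr(X | Y) = Pr(X, Y) / Pr(Y), column by column
pr_y_given_x = joint / pr_x[:, None]   # Pr(Y | X) = Pr(X, Y) / Pr(X), row by row

# Bayes' theorem: Pr(X | Y) = Pr(Y | X) Pr(X) / Pr(Y)
bayes = pr_y_given_x * pr_x[:, None] / pr_y

print(np.allclose(bayes, pr_x_given_y))
```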

Page 2

X, Y: random variables

Independence:

$$ X \perp Y \;\overset{\text{def}}{\Longleftrightarrow}\; \Pr(X, Y) = \Pr(X)\,\Pr(Y) $$

Equivalently, Pr(X | Y) = Pr(X).

Example 1 (X, Y, Z: random variables): Z ~ Bernoulli(1/2)
- Z = 0: X ~ N(0, 1), Y ~ N(0, 1)
- Z = 1: X ~ N(2, 1), Y ~ N(2, 1)

Example 2 (X, Y, Z: random variables): Z ~ N(0, 1), X = Z + e_X, Y = Z + e_Y, with e_X ~ N(0, 1) and e_Y ~ N(0, 1)

[Figure: scatter plots of Y against X, and of Y | Z against X | Z]

Conditional independence (X, Y, Z: random variables):

$$ X \perp Y \mid Z \;\overset{\text{def}}{\Longleftrightarrow}\; \Pr(X, Y \mid Z) = \Pr(X \mid Z)\,\Pr(Y \mid Z) $$

Equivalently, Pr(X | Y, Z) = Pr(X | Z).
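Both examples on this page describe variables that are dependent marginally but independent once Z is known. A minimal simulation sketch of that point follows; the sample size and seed are arbitrary choices, and correlation is used as a stand-in for dependence, which is adequate here only because the variables are Gaussian given Z.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Example 1: Z ~ Bernoulli(1/2); given Z, X and Y are independent N(2Z, 1).
z1 = rng.integers(0, 2, size=n)
x1 = rng.normal(2 * z1, 1.0)
y1 = rng.normal(2 * z1, 1.0)
marginal_corr_1 = np.corrcoef(x1, y1)[0, 1]
conditional_corr_1 = [np.corrcoef(x1[z1 == k], y1[z1 == k])[0, 1] for k in (0, 1)]

# Example 2: Z ~ N(0, 1), X = Z + e_X, Y = Z + e_Y.
z2 = rng.normal(size=n)
x2 = z2 + rng.normal(size=n)
y2 = z2 + rng.normal(size=n)
marginal_corr_2 = np.corrcoef(x2, y2)[0, 1]
# Conditioning on Z: remove the known contribution of Z and correlate what is left.
conditional_corr_2 = np.corrcoef(x2 - z2, y2 - z2)[0, 1]

print(marginal_corr_1, conditional_corr_1)
print(marginal_corr_2, conditional_corr_2)
```

In both examples the theoretical marginal correlation is 0.5 (covariance 1, variances 2), while the correlation given Z is 0.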

Page 3

X_1, ..., X_p: random variables

G: directed acyclic graph on X_1, ..., X_p

Pa(X) = parents of X in G

ND(X) = non-descendants of X in G

Markov property:

$$ \Pr(X_j \mid ND(X_j)) = \Pr(X_j \mid Pa(X_j)) $$

Example (graph with edges F → E, E → C, E → D, C → B, D → B, B → A):
- D ⊥ F | E
- B ⊥ {E, F} | {C, D}

Factorization of the joint probability:

$$ \Pr(X_1,\dots,X_p) = \prod_{j=1}^{p} \Pr(X_j \mid Pa(X_j)) $$

Pr(A, B, C, D, E, F) = Pr(A|B) Pr(B|C,D) Pr(C|E) Pr(D|E) Pr(E|F) Pr(F)
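The factorization for the six-node example can be exercised directly. The sketch below assumes binary variables and fills the conditional probability tables with random numbers (both are illustrative assumptions, not part of the slides). It builds the joint from the product of Pr(X_j | Pa(X_j)) and checks numerically that D ⊥ F | E holds.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
names = ["A", "B", "C", "D", "E", "F"]
parents = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": ["F"], "F": []}

# Random CPTs for binary variables: cpt[v][parent values] = Pr(v = 1 | parents).
cpt = {v: {pv: rng.uniform(0.1, 0.9)
           for pv in itertools.product((0, 1), repeat=len(ps))}
       for v, ps in parents.items()}

def joint(assign):
    """Pr(A, ..., F) as the product of Pr(X_j | Pa(X_j))."""
    prob = 1.0
    for v, ps in parents.items():
        p1 = cpt[v][tuple(assign[q] for q in ps)]
        prob *= p1 if assign[v] == 1 else 1.0 - p1
    return prob

table = {vals: joint(dict(zip(names, vals)))
         for vals in itertools.product((0, 1), repeat=len(names))}
index = {name: i for i, name in enumerate(names)}

def marginal(fixed):
    """Sum the joint over all assignments that agree with `fixed`."""
    return sum(pr for vals, pr in table.items()
               if all(vals[index[k]] == v for k, v in fixed.items()))

def max_gap_d_f_given_e():
    """Largest |Pr(D, F | E) - Pr(D | E) Pr(F | E)| over all value combinations."""
    gap = 0.0
    for d, f, e in itertools.product((0, 1), repeat=3):
        p_e = marginal({"E": e})
        lhs = marginal({"D": d, "F": f, "E": e}) / p_e
        rhs = (marginal({"D": d, "E": e}) / p_e) * (marginal({"F": f, "E": e}) / p_e)
        gap = max(gap, abs(lhs - rhs))
    return gap

print(sum(table.values()), max_gap_d_f_given_e())
```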

Page 4

Microarray data: n arrays (i = 1,…, n) and p genes ( j = 1,…, p)

gene 1 ~ X_1: x_{11}, x_{21}, …, x_{n1}

gene 2 ~ X_2: x_{12}, x_{22}, …, x_{n2}

…

gene p ~ X_p: x_{1p}, x_{2p}, …, x_{np}

X = (x_ij)

[Figure: example graph G on the nodes X1, X2, X3, X4]

$$ \Pr(X_1,\dots,X_p) = \prod_{j=1}^{p} \Pr(X_j \mid Pa(X_j)) $$

s(G | X): score of the graph G given the data X

Pr(X_j | Pa(X_j)) is modeled by a density f_j(x_ij | pa_ij, θ_j), where pa_ij denotes the values of the parents of gene j in array i. For example, if X1 is the parent of X3, then pa_i3 = x_i1 and the density of x_i3 is f_3.

$$ f(x_{i1},\dots,x_{ip} \mid \theta_G) = \prod_{j=1}^{p} f_j(x_{ij} \mid pa_{ij}, \theta_j), \qquad i = 1,\dots,n $$

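Page 5

Model for f_j(x_ij | pa_ij, θ_j)

Linear regression:

$$ y_i = \alpha + \beta x_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2) $$

$$ f(y_i \mid x_i; \theta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(y_i - \alpha - \beta x_i)^2}{2\sigma^2} \right\} $$

Nonparametric additive regression:

$$ x_{ij} = m_{j1}\big(p^{(j)}_{i1}\big) + \cdots + m_{jq_j}\big(p^{(j)}_{iq_j}\big) + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma_j^2) $$

Here pa_ij = (p^{(j)}_{i1}, …, p^{(j)}_{iq_j}) are the values of the q_j parents of gene j in array i, and each m_jk(·): R → R is a smooth function expanded in basis functions:

$$ m_{jk}\big(p^{(j)}_{ik}\big) = \sum_{l=1}^{M_{jk}} \gamma^{(j)}_{lk}\, b^{(j)}_{lk}\big(p^{(j)}_{ik}\big) $$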

[Figure: plots against X for a target gene with parents X, Y, Z. Vertical axes: Target; Target - m_Y(Y) (Target minus the effect of Y); Target - m_Y(Y) - m_Z(Z) (Target minus the effects of Y and Z).]

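Page 6

$$ f(x_{i1},\dots,x_{ip} \mid \theta_G) = \prod_{j=1}^{p} f_j(x_{ij} \mid pa_{ij}; \theta_j) $$

$$ x_{ij} = m_{j1}\big(p^{(j)}_{i1}\big) + \cdots + m_{jq_j}\big(p^{(j)}_{iq_j}\big) + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma_j^2) $$

$$ m_{jk}\big(p^{(j)}_{ik}\big) = \sum_{l=1}^{M_{jk}} \gamma^{(j)}_{lk}\, b^{(j)}_{lk}\big(p^{(j)}_{ik}\big) $$

$$ f_j(x_{ij} \mid pa_{ij}; \theta_j) = \frac{1}{\sqrt{2\pi\sigma_j^2}} \exp\left[ -\frac{\Big\{ x_{ij} - \sum_{k=1}^{q_j}\sum_{l=1}^{M_{jk}} \gamma^{(j)}_{lk}\, b^{(j)}_{lk}\big(p^{(j)}_{ik}\big) \Big\}^2}{2\sigma_j^2} \right] $$

Given the data X, choose the graph G with the highest posterior probability:

$$ \hat{G} = \arg\max_{G}\, p(G \mid X) $$

$$ p(G \mid X) = \frac{p(X \mid G)\, p(G)}{p(X)} \propto p(X \mid G)\, p(G) $$

$$ p(X \mid G) = \int p(X \mid \theta, G)\, p(\theta \mid G)\, d\theta $$

- p(G): prior probability of the graph
- p(X | θ, G): likelihood, for example x_j = m_a(x_a) + m_c(x_c) + ε_j with m(x) = Σ_{k=1}^{M} γ_k b_k(x)
- p(θ | G): prior distribution of the parameters, of exponential form p(θ_j | λ_j) ∝ exp(·) with hyperparameter λ_j

Plug-in alternative based on the likelihood:

$$ p(X \mid \theta, G) = L(\theta \mid G), \qquad \hat{\theta} = \arg\max_{\theta} L(\theta \mid G), \qquad \hat{p}(X \mid G) = L(\hat{\theta} \mid G) $$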
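The basis-expansion regression with a prior on the coefficients can be illustrated with a one-parent toy problem. The sketch below is only an assumption-laden stand-in: it uses Gaussian bump basis functions instead of the spline basis of the slides, a second-difference roughness penalty, and simulated data. Minimizing the penalized sum of squares corresponds to a MAP estimate under Gaussian noise and a Gaussian prior on the coefficients whose precision is proportional to the penalty matrix.

```python
import numpy as np

def basis_matrix(x, centers, width):
    """Gaussian bump basis b_l(x); a hypothetical stand-in for a spline basis."""
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)

def fit_penalized(x, y, centers, width, lam):
    """Minimize ||y - B g||^2 + n * lam * g' K g, with K a second-difference penalty."""
    B = basis_matrix(x, centers, width)
    D = np.diff(np.eye(len(centers)), n=2, axis=0)   # second differences of coefficients
    K = D.T @ D
    gamma = np.linalg.solve(B.T @ B + len(x) * lam * K, B.T @ y)
    return gamma, B @ gamma

# Simulated parent/child pair with a nonlinear relationship (illustrative only).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-3.0, 3.0, size=200))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)
centers = np.linspace(-3.0, 3.0, 12)
width = centers[1] - centers[0]

rss = {}
for lam in (1e-4, 1e-1, 10.0):
    _, fitted = fit_penalized(x, y, centers, width, lam)
    rss[lam] = float(np.sum((y - fitted) ** 2))
    print(lam, rss[lam])
```

Larger values of `lam` give smoother curves and a larger residual sum of squares, which is why the smoothing hyperparameter has to be chosen by a criterion rather than by fit alone.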

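Page 7

p(θ | X): posterior distribution of the parameters

Laplace approximation of the marginal likelihood:

$$ \int \prod_{i=1}^{n} f(x_{i1},\dots,x_{ip} \mid \theta)\, p(\theta \mid \lambda)\, d\theta = \int \exp\{ n\, l_\lambda(\theta \mid D) \}\, d\theta $$

$$ = \frac{(2\pi)^{r/2}}{n^{r/2}\, |J_\lambda(\hat{\theta})|^{1/2}} \exp\{ n\, l_\lambda(\hat{\theta} \mid D) \}\, \{ 1 + O_p(n^{-1}) \} $$

where

$$ l_\lambda(\theta \mid D) = \frac{1}{n} \sum_{i=1}^{n} \log f(x_{i1},\dots,x_{ip} \mid \theta) + \frac{1}{n} \log p(\theta \mid \lambda) $$

$$ J_\lambda(\theta) = -\frac{\partial^2 l_\lambda(\theta \mid D)}{\partial\theta\, \partial\theta^{T}}, \qquad r = \dim(\theta) $$

$$ \hat{\theta} = \arg\max_{\theta}\, l_\lambda(\theta \mid D) $$

$$ \mathrm{BNRC}(G) = -2 \log \hat{p}(G \mid X) = \sum_{j=1}^{p} \mathrm{BNRC}_j $$

$$ \hat{G} = \arg\min_{G}\, \mathrm{BNRC}(G) $$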
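The accuracy of a Laplace approximation of this kind can be checked on a toy model where the integral is known in closed form. The sketch below is not the network model of the slides: it assumes n Bernoulli observations with k successes and a flat prior on θ, so that the exact integral is a Beta function, r = 1, and J(θ̂) = 1 / (θ̂(1 - θ̂)).

```python
import math

def log_exact(n, k):
    """log of the integral of theta^k (1 - theta)^(n - k) over (0, 1), i.e. log B(k+1, n-k+1)."""
    return math.lgamma(k + 1) + math.lgamma(n - k + 1) - math.lgamma(n + 2)

def log_laplace(n, k):
    """log of (2*pi/n)^(r/2) |J|^(-1/2) exp{n l(theta_hat)} with r = 1 and a flat prior."""
    theta_hat = k / n                                   # maximizer of l(theta)
    l_hat = (k * math.log(theta_hat) + (n - k) * math.log(1 - theta_hat)) / n
    j_hat = 1.0 / (theta_hat * (1 - theta_hat))         # minus the second derivative of l
    return 0.5 * math.log(2 * math.pi / n) - 0.5 * math.log(j_hat) + n * l_hat

def relative_error(n, k):
    return abs(math.exp(log_laplace(n, k) - log_exact(n, k)) - 1.0)

for n, k in ((50, 20), (500, 200)):
    print(n, k, relative_error(n, k))
```

The relative error shrinks roughly in proportion to 1/n, in line with the order of the remainder term in the approximation.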

Page 8

$$ \hat{G} = \arg\min_{G}\, \mathrm{BNRC}(G) $$

Optimal networks can be found using $O(p\,2^{p})$ dynamic programming steps.

Networks of around 25 nodes can be optimized in a day using a 200 GFLOPS computer.
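A dynamic program over subsets of nodes is one way to organize such an exact search. The sketch below is a minimal illustration under stated assumptions, not the implementation behind the slides: it swaps in a BIC-style Gaussian linear score for BNRC_j, and the names `local_score` and `optimal_network` are hypothetical. It first finds the best parent set of each node within every candidate set, then finds the best network on every subset by choosing its last node (sink).

```python
import math
import numpy as np

def local_score(data, j, parents):
    """BIC-style score of node j given its parents (lower is better); a stand-in for BNRC_j."""
    n = data.shape[0]
    design = np.column_stack([np.ones(n)] + [data[:, k] for k in parents])
    beta, *_ = np.linalg.lstsq(design, data[:, j], rcond=None)
    rss = float(np.sum((data[:, j] - design @ beta) ** 2))
    return n * math.log(rss / n) + design.shape[1] * math.log(n)

def optimal_network(data):
    p = data.shape[1]
    bits = lambda mask: [k for k in range(p) if mask >> k & 1]
    # best[j][C] = (score, parent mask): best parent set of j among subsets of C.
    best = [dict() for _ in range(p)]
    for j in range(p):
        for cand in range(1 << p):
            if cand >> j & 1:
                continue
            entry = (local_score(data, j, bits(cand)), cand)
            for k in bits(cand):                 # smaller sets were filled in earlier
                sub = best[j][cand & ~(1 << k)]
                if sub[0] < entry[0]:
                    entry = sub
            best[j][cand] = entry
    # total[S] = (score of the best network on S, sink node of that network).
    total = {0: (0.0, -1)}
    for subset in range(1, 1 << p):
        total[subset] = min(
            (total[subset & ~(1 << j)][0] + best[j][subset & ~(1 << j)][0], j)
            for j in bits(subset))
    # Backtrack: peel off sinks to recover the parent sets.
    parents_of, subset = {}, (1 << p) - 1
    while subset:
        j = total[subset][1]
        subset &= ~(1 << j)
        parents_of[j] = bits(best[j][subset][1])
    return total[(1 << p) - 1][0], parents_of

# Simulated chain x0 -> x1 -> x2 plus an unrelated x3 (illustrative data).
rng = np.random.default_rng(0)
n = 300
x0 = rng.normal(size=n)
x1 = x0 + rng.normal(scale=0.5, size=n)
x2 = x1 + rng.normal(scale=0.5, size=n)
x3 = rng.normal(size=n)
data = np.column_stack([x0, x1, x2, x3])
score, parents_of = optimal_network(data)
print(score, parents_of)
```

The result is acyclic by construction, since each node only takes parents from the nodes peeled off after it. Edge directions within a chain may come out reversed, because the score cannot distinguish graphs that encode the same independencies. The number of local scores evaluated grows like p·2^(p-1), so this sketch is only practical for small p.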

Complexity terms shown on the slide: $O(p\,2^{p})$, $O(p^{3})$, $O(mp^{2})$, $O(p\,2^{\max\{m_i\}})$

An undirected graph (the super-structure) covers the skeleton of the true graph. In practice we estimate a super-structure by an independence test with a relaxed significance level.

Total computational complexity is in $O(p\,2^{m} + |\mathrm{Con}(S)|)$.
- S is the super-structure, either imposed as a constraint or estimated from data.
- m is the maximum degree of S.
- $|\mathrm{Con}(S)| < O(\alpha_m^{p})$, where $\alpha_m = (2^{m+1} - 1)^{1/(m+1)}$. In practice, $1.5^{p}$ to $1.9^{p}$ depending on m.

Networks with 50 nodes can be optimized in a few hours using a laptop computer if the average degree is around 2.

Page 9

Distributed-memory servers
- Sun Blade X6250: Intel Quad Core Xeon E5450 (3.0 GHz) x 2; 724 nodes (4 x 2 x 724 = 5760 cores); memory 32 Gbyte/node (~23 Tbyte in total)
- Sun Fire X4440: AMD Quad Core Opteron 8356 (2.3 GHz) x 4; 12 nodes (4 x 4 x 12 = 192 cores); memory 128 Gbyte/node (~1.5 Tbyte in total)

About 75 TFLOPS in peak

Shared-memory server
- SGI Altix 4700: Intel Dual Core Itanium 9150M (1.66 GHz) x 64 (2 x 64 = 128 cores); memory 2 Tbyte

~225 TFLOPS (2011~)

X = (x_ij)

$$ \Pr(X_1,\dots,X_p) = \prod_{j=1}^{p} \Pr(X_j \mid Pa(X_j)) $$

s(G | X)

G

Page 10

Human umbilical vein endothelial cell (HUVEC)

Fenofibrate: agonist of PPARα, drug for hyperlipemia

CodeLink(TM) Human Uniset I 20K (20,469 probes)

Time-course data (triplicate)
- HUVEC treated with 25 μM fenofibrate
- 0 (control), 2, 4, 6, 8 and 18 hours (6 time points)

Knock-down data
- 400 KD (by siRNA) arrays (in 2006 we use 270 KDs)
- Most knock-down genes are transcription factors

1192 genes are identified as the fenofibrate-related genes (the details are described in Imoto et al. (2006) PSB).

[Figure: example network on A, B, C, D, E, F measured by microarray under control siRNA and under siRNA for D]

Imoto, S., Tamada, Y., Araki, H., Yasuda, K., Print, C.G., Charnock-Jones, D.S., Sanders, D., Savoie, C.J., Tashiro, K., Kuhara, S., Miyano, S. (2006) Computational strategy for discovering druggable gene networks from genome-wide RNA expression profiles. Pacific Symposium on Biocomputing, 11, 559–571.

Page 11

Network of 1192 fenofibrate-induced genes

Downstream pathway of PPARα

Focus on lipid metabolism genes

Genes shown in the figure:
- PPARa: peroxisome proliferative activated receptor, alpha
- RARG: retinoic acid receptor, gamma
- ITPR3: inositol 1,4,5-triphosphate receptor, type 3
- EHHADH: enoyl-Coenzyme A, hydratase/3-hydroxyacyl Coenzyme A dehydrogenase
- DCI: dodecenoyl-Coenzyme A delta isomerase
- SREBF1: sterol regulatory element binding transcription factor 1
- IL4: interleukin 4
- LDLR: low density lipoprotein receptor
- HSD17B4: hydroxysteroid (17-beta) dehydrogenase 4

Functional labels in the figure: fatty acid beta-oxidation, fatty acid synthesis, cholesterol metabolism

Literature cited in the figure: Kassam et al. (2000) J. Biol. Chem.; Knight et al. (2005) Biochem. J.; Fan et al. (1998) J. Endocrinol.; Bernal-Mizrachi et al. (2003) Nat. Med.

Drug targets in the figure:
- HMGCR: Sankyo
- LSS: Roche
- PPARa: Fenofibrate
- LIPG
- AKR1C3
- COX2

Druggable: Nat. Rev. Drug Discov. 1:727-30, 2002

Page 12

GENE X

Downstream of GENE X (pick up lipid metabolism genes):
- C14orf1 (ERG28)
- ACAT2
- TCEA2
- LSS
- PTGS1
- IDI1
- HMGCS1
- BMP4
- PTGS2
- HMGCR
- FDFT1
- DHCR24
- ACAS2

Legend:
• Cholesterol synthesis genes
• Lipid metabolism genes

COX-2 inhibitors cause heart disease and cerebral stroke!!

Various COX-2 inhibitors
- Vioxx @ Merck
- Celebrex @ Pfizer
- Bextra @ Pfizer

Page 13

Araki, H., Tamada, Y., Imoto, S., Dunmore, B., Sanders, D., Humphrey, S., Nagasaki, M., Doi, A., Nakanishi, Y., Yasuda, K., Tomiyasu, Y., Tashiro, K., Print, C., Charnock-Jones, D.S., Kuhara, S., Miyano, S. (2009). Analysis of PPAR alpha-dependent and PPAR alpha-independent transcript regulation following fenofibrate treatment of human endothelial cells, Angiogenesis, in press.

[Figure: expression against time for Gene 1, Gene 2, ..., Gene k under control and drug, showing early, middle and late responses; the drug leads to drug efficacy and to side effects]

D = {x_1, ..., x_t, ..., x_T}: time-course microarray data

x_t = (x_1(t), ..., x_p(t)): expression data at time t

Markov property between time points: x_1 → x_2 → ... → x_{T-1} → x_T

$$ p(x_1,\dots,x_t,\dots,x_T) = p(x_1)\, p(x_2 \mid x_1) \cdots p(x_T \mid x_{T-1}) $$

Gene network: bipartite graph

[Figure: bipartite graphs linking x_1(t-1), ..., x_p(t-1) to x_1(t), ..., x_p(t), unrolled over t = 1, 2, ..., T-1, T]

$$ p(x_t \mid x_{t-1}) = \prod_{j} p\big(x_j(t) \mid pa_j(t-1), \theta_j\big) $$

[Figure: estimated bipartite graph between x_1(t-1), ..., x_5(t-1) and x_1(t), ..., x_5(t), and the corresponding gene network on x1, ..., x5, which contains a self-loop and feedback regulation]

Page 14

$$ p(x_t \mid x_{t-1}) = \prod_{j} p\big(x_j(t) \mid pa_j(t-1), \theta_j\big) $$

$$ x_j(t) = m_j\big(pa_j(t-1)\big) + \varepsilon_j(t) $$

$$ m_j\big(pa_j(t-1)\big) = m_{j1}\big(pa_{j1}(t-1)\big) + m_{j2}\big(pa_{j2}(t-1)\big) + \cdots $$

$$ m_{jk}(x) = \sum_{\alpha=1}^{M} \gamma_{\alpha}\, b_{\alpha}(x) $$

Kim, Imoto, Miyano (2004) Biosystems, 75, 57-65.
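The first-order Markov structure means each gene at time t is regressed on gene values at time t-1, which allows self-loops and feedback. The sketch below illustrates this with a deliberately simplified model: linear functions in place of the nonparametric m_jk, a made-up coefficient matrix `A`, and an ad hoc threshold of 0.2 for declaring an edge. None of these choices come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
p, T = 4, 2000

# A[j, k] != 0 means gene k at time t-1 regulates gene j at time t (hypothetical values).
A = np.zeros((p, p))
A[0, 0] = 0.8     # self-loop on gene 0
A[1, 0] = 0.7     # gene 0 -> gene 1
A[2, 1] = -0.6    # gene 1 -> gene 2
A[0, 2] = 0.4     # gene 2 -> gene 0 (feedback)

# Simulate x(t) = A x(t-1) + noise.
x = np.zeros((T, p))
for t in range(1, T):
    x[t] = A @ x[t - 1] + rng.normal(scale=0.5, size=p)

# Regress every gene at time t on all genes at time t-1 (least squares).
past, present = x[:-1], x[1:]
coef, *_ = np.linalg.lstsq(past, present, rcond=None)   # present ≈ past @ coef
A_hat = coef.T

# Keep edges (parent k, child j) whose estimated coefficient is clearly non-zero.
edges = {(k, j) for j in range(p) for k in range(p) if abs(A_hat[j, k]) > 0.2}
print(sorted(edges))
```

Because parents are always taken from the previous time point, the bipartite graph is acyclic even though the collapsed gene network contains the cycle 0 → 1 → 2 → 0 and a self-loop.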

[Figure: gene 1, gene 2, ..., gene k with gene sets A1, A2, A4]

Node sets: N1 = A1, N2 = A1 ∪ A2, N4 = A3 ∪ A4

G1 = (N1, E1), G2 = (N2, E2), G4 = (N4, E4)

Kim, Imoto, Miyano (2004) Biosystems, 75(1-3), 57–65. Tamada et al. (2009) Pac. Symp. Biocomput., 14, 251–263.

[Figure: networks G_{N1}, G_{N2}, G_{N3}, G_{N4}, G_{N5}]

Page 15

[Figure: networks G_{N1}, G_{N2}, G_{N3}, G_{N4}, G_{N5}]

Background

Fenofibrate is a synthetic ligand for PPARa. However, there are reports that fenofibrate affects endothelial cells in a PPARa-independent manner.

Aim and Method

Using siRNA for PPARa (not included in the 400 KDs), we separate fenofibrate-regulated genes into PPARa-dependent and PPARa-independent genes.

Page 16

This gene is PPARa-dependently regulated by fenofibrate

(1.7-fold & q < 0.05)

(1.7-fold)

Fenofibrate Regulated Genes

PPARa-dependently Regulated Genes by Fenofibrate

666 - 167 = 499

499/666 ≈ 74.9% PPARa-independent

GDF15 inhibits endothelial cell migration and decreases matrix metallopeptidase 2 (MMP2) activity produced by the HUVECs in a concentration-dependent manner. These effects are very similar to fenofibrate's effects.

Page 17

University of Tokyo
- Prof. Satoru Miyano
- Dr. Yoshinori Tamada
- Dr. Masao Nagasaki

University of Auckland
- Prof. Cristin Print

Kyushu University
- Prof. Satoru Kuhara
- Prof. Kousuke Tashiro

University of Cambridge
- Prof. D. Stephen Charnock-Jones
- Dr. Ben Dunmore
- Dr. Deborah Sanders
- Dr. Sally Humphreys

Cell Innovator Inc.
- Dr. Hiromitsu Araki
- Dr. Atsushi Doi
- Dr. Kaori Yasuda
- Yukiko Nakanishi
- Yuki Tomiyasu