Inferring networks from multiple samples with consensus LASSO


Séminaire de statistique de Montpellier, October 6, 2014, Montpellier, France


Short overview on biological background · Network inference and GGM · Inference with multiple samples · Simulations

Joint network inference with the consensus LASSO

Nathalie Villa-Vialaneix, joint work with Matthieu Vignes, Nathalie Viguerie and Magali San Cristobal

Séminaire de Statistique de Montpellier, October 6, 2014

http://www.nathalievilla.org

nathalie.villa@toulouse.inra.fr

Nathalie Villa-Vialaneix | Consensus Lasso 1/36


Outline

1 Short overview on biological background

2 Network inference and GGM

3 Inference with multiple samples

4 Simulations


DNA

DNA (deoxyribonucleic acid)

A molecule that encodes the genetic instructions used in the development and functioning of all known living organisms and many viruses; a double helix made of only four nucleotides (adenine, cytosine, thymine, guanine): A binds with T and C with G.


Transcription

Transcription of DNA

Transcription is the process by which a particular segment of DNA (called a gene) is copied into a single-strand RNA (called messenger RNA, or mRNA).

A is transcribed into U, T into A, C into G and G into C.


What is mRNA used for?

mRNA then moves out of the cell nucleus and is translated into proteins, which are made of 20 different amino acids (using an alphabet: 3 letters of mRNA → 1 amino acid).

Proteins are used by the cell to function.


Gene expression

In a given cell, at a given time, not all genes are expressed, or not at the same level.

Gene expression

is the process by which a given gene is transcribed and/or translated. It depends on the type of cell, the environment, ...

Gene expression can be measured either by:

“counting” the number of copies (mRNA) of a given gene in the cell: transcriptomic data;

“counting” the quantity of a given protein in the cell: proteomic data.

Even though a given mRNA is translated into a unique protein, there is no simple relationship between transcriptomic and proteomic data.


Transcriptomic data

How are transcriptomic data obtained?

DNA spots associated with target genes (probes) are attached to a solid surface (the array).

RNA material extracted from cells of interest (blood, lipid tissue, muscle, urine...) is labeled fluorescently and applied on the array.

When the binding between mRNA and the probes is good, the spot becomes fluorescent.

Expression of a given gene is quantified by the intensity of the fluorescent signal (read by a scanner).


Typical microarray data

Data: large scale gene expression data

X = (X_i^j): an n × p matrix with individuals in rows (n ≈ 30-50) and variables (gene expressions) in columns (p ≈ 10³-10⁴).

Typical design: two (or more) conditions (treated/control for instance). More and more complicated designs: crossed conditions (treated/control and different breeds), longitudinal data...

Typical issue: find differentially expressed genes (i.e., genes whose expression is significantly different between the conditions).


Systems biology

Instead of being used to produce proteins, some genes’ expressions activate or repress other genes’ expressions ⇒ understanding the whole cascade helps to comprehend the global functioning of living organisms.¹

¹ Picture taken from: Abdollahi A et al., PNAS 2007, 104:12890-12895. © 2007 by National Academy of Sciences


Model framework

Data: large scale gene expression data

X = (X_i^j): an n × p matrix with individuals in rows (n ≈ 30-50) and variables (gene expressions) in columns (p ≈ 10³-10⁴).

What we want to obtain: a graph/network with

nodes: (selected) genes;

edges: strong links between gene expressions.


Advantages of network inference

1 over raw data: focuses on the strongest direct relationships: irrelevant or indirect relations are removed (more robust) and the data are easier to visualize and understand (track transcription relations). Expression data are analyzed all together and not by pairs (systems model).

2 over bibliographic networks: can handle interactions with yet unknown (not annotated) genes and deal with data collected in a particular condition.


Using correlations: relevance networks [Butte and Kohane, 1999]

First (naive) approach: compute correlations between expressions for all pairs of genes, threshold the smallest ones and build the network.

[Figure: correlation matrix → thresholding → graph]
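As a minimal sketch of this thresholding approach (written here in Python/NumPy rather than in R as elsewhere in the talk; the simulated "genes" and the 0.8 cutoff are arbitrary illustrations, not the authors' settings):

```python
import numpy as np

# Simulate expressions of p = 4 "genes" for n = 200 individuals:
# genes 0 and 1 are co-expressed, genes 2 and 3 are co-expressed.
rng = np.random.default_rng(0)
n = 200
base1 = rng.normal(size=n)
base2 = rng.normal(size=n)
X = np.column_stack([
    base1 + 0.3 * rng.normal(size=n),
    base1 + 0.3 * rng.normal(size=n),
    base2 + 0.3 * rng.normal(size=n),
    base2 + 0.3 * rng.normal(size=n),
])

C = np.corrcoef(X, rowvar=False)   # correlations for all pairs of genes
A = np.abs(C) > 0.8                # threshold the smallest ones
np.fill_diagonal(A, False)         # ignore self-correlations
print(A.astype(int))               # adjacency matrix: edges 0-1 and 2-3 only
```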


Using partial correlations

[Figure: x drives both y and z, inducing a strong indirect correlation between y and z]

set.seed(2807); x <- rnorm(100)
y <- 2*x + 1 + rnorm(100, 0, 0.1); cor(x, y)   # [1] 0.998826
z <- 2*x + 1 + rnorm(100, 0, 0.1); cor(x, z)   # [1] 0.998751
cor(y, z)                                      # [1] 0.9971105

Partial correlations:

cor(lm(x~z)$residuals, lm(y~z)$residuals)      # [1] 0.7801174
cor(lm(x~y)$residuals, lm(z~y)$residuals)      # [1] 0.7639094
cor(lm(y~x)$residuals, lm(z~x)$residuals)      # [1] -0.1933699


Partial correlation and GGM

Assume (X_i)_{i=1,…,n} are i.i.d. Gaussian random vectors N(0, Σ) (gene expressions); then

j ←→ j′ (genes j and j′ are linked) ⇔ Cor(X^j, X^{j′} | (X^k)_{k≠j,j′}) ≠ 0.

If S = Σ⁻¹ (the concentration matrix), then

Cor(X^j, X^{j′} | (X^k)_{k≠j,j′}) = − S_{jj′} / √(S_{jj} S_{j′j′})

⇒ estimate Σ⁻¹ to unravel the graph structure.

Problem: Σ is a p-dimensional matrix and n ≪ p ⇒ (Σ̂_n)⁻¹ is a poor estimate of S!
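A quick numerical sanity check of this identity (a Python/NumPy sketch with three variables standing in for genes; the variables mirror the R example from the previous slide and are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)
y = 2 * x + rng.normal(scale=0.1, size=n)   # y and z are both driven by x
z = 2 * x + rng.normal(scale=0.1, size=n)

X = np.column_stack([x, y, z])
S = np.linalg.inv(np.cov(X, rowvar=False))  # concentration matrix S = Sigma^{-1}

def pcor(S, j, k):
    """Partial correlation of variables j and k given the rest, from S."""
    return -S[j, k] / np.sqrt(S[j, j] * S[k, k])

def resid(a, b):
    """Residuals of a regressed on b (with intercept)."""
    B = np.column_stack([np.ones_like(b), b])
    coef, *_ = np.linalg.lstsq(B, a, rcond=None)
    return a - B @ coef

r = np.corrcoef(resid(y, x), resid(z, x))[0, 1]
print(np.corrcoef(y, z)[0, 1])   # marginal correlation: close to 1
print(pcor(S, 1, 2), r)          # both near 0: y and z are independent given x
```

The residual-based definition and the concentration-matrix formula agree exactly (up to floating point), which is the identity the slide states.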


Various approaches for inferring networks with GGM

Graphical Gaussian Model

seminal work: [Schäfer and Strimmer, 2005a, Schäfer and Strimmer, 2005b] (with shrinkage and a proposal for a Bayesian test of significance):

estimate Σ⁻¹ by (Σ̂_n + λI)⁻¹;

use a Bayesian test to test which coefficients are significantly non-zero.

sparse approaches: [Meinshausen and Bühlmann, 2006, Friedman et al., 2008]: ∀ j, estimate the linear model X^j = β_j^T X^{−j} + ε by

arg min_{(β_{jj′})_{j′}}  Σ_{i=1}^n (X_{ij} − β_j^T X_i^{−j})² + λ ‖β_j‖_{L1}

with ‖β_j‖_{L1} = Σ_{j′} |β_{jj′}|, because β_{jj′} = −S_{jj′} / S_{jj}.

The L1 penalty yields β_{jj′} = 0 for most j′ (variable selection).
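A sketch of this neighborhood-selection idea in Python/NumPy; a plain proximal-gradient lasso (ISTA) stands in for the dedicated solvers cited above, and the λ value, the three-gene chain and the small 0.1 cutoff on the estimated coefficients are arbitrary illustrations:

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        u = b - X.T @ (X @ b - y) / L      # gradient step
        b = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)  # soft-threshold
    return b

def neighborhood_selection(X, lam):
    """Lasso-regress each gene on all the others; draw edge j - j' when a coefficient survives."""
    n, p = X.shape
    A = np.zeros((p, p), dtype=bool)
    for j in range(p):
        others = np.delete(np.arange(p), j)
        b = lasso_ista(X[:, others], X[:, j], lam)
        A[j, others] = np.abs(b) > 0.1     # small cutoff: near-boundary coefficients can be tiny but non-zero
    return A | A.T                         # symmetrize ("or" rule)

# Three genes forming a chain 0 -> 1 -> 2: expect edges 0-1 and 1-2 only.
rng = np.random.default_rng(1)
n = 500
g0 = rng.normal(size=n)
g1 = g0 + 0.5 * rng.normal(size=n)
g2 = g1 + 0.5 * rng.normal(size=n)
X = np.column_stack([g0, g1, g2])
X = X - X.mean(axis=0)
print(neighborhood_selection(X, lam=50.0).astype(int))
```

Note how the indirect pair 0-2, which has a strong marginal correlation, gets no edge: its partial (conditional) relationship is null, exactly the point of the GGM approach.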


Motivation for multiple networks inference

Pan-European project Diogenes² (with Nathalie Viguerie, INSERM): gene expressions (lipid tissue) from 204 obese women before and after a low-calorie diet (LCD).

Assumption: a common functioning exists regardless of the condition.

Question: which genes are linked independently from, or depending on, the condition?

² http://www.diogenes-eu.org/; see also [Viguerie et al., 2012]


Naive approach: independent estimations

Notations: p genes measured in k samples, each corresponding to a specific condition: (X^c_j)_{j=1,…,p} ∼ N(0, Σ^c), for c = 1, …, k. For c = 1, …, k, there are n_c independent observations (X^c_{ij})_{i=1,…,n_c}, with Σ_c n_c = n.

Independent inference

For all c = 1, …, k and all j = 1, …, p, the models

X^c_j = X^c_{\j} β^c_j + ε^c_j

are estimated (independently) by maximizing the pseudo-likelihood

L(S | X) = Σ_{c=1}^k Σ_{j=1}^p Σ_{i=1}^{n_c} log P(X^c_{ij} | X^c_{i,\j}, S^c_j)


Related papers

Problem: the previous estimation does not use the fact that the different networks should be somehow alike!

Previous proposals:

- [Chiquet et al., 2011] replace Σ^c by Σ̃^c = ½ Σ^c + ½ Σ and add a sparse penalty;

- [Chiquet et al., 2011] use LASSO and group-LASSO type penalties to force identical or sign-coherent edges between conditions:

  Σ_{jj′} √(Σ_c (S^c_{jj′})²)  or  Σ_{jj′} [ √(Σ_c (S^c_{jj′})₊²) + √(Σ_c (S^c_{jj′})₋²) ]

  ⇒ S^c_{jj′} = 0 for all c for most entries, or S^c_{jj′} can only be of a given sign (positive or negative) whatever c;

- [Danaher et al., 2013] add the penalty Σ_{c,c′} ‖S^c − S^{c′}‖_{L1} (a sparse penalty on the differences between concentration matrices forces strong consistency between conditions);

- [Mohan et al., 2012] add a group-LASSO like penalty Σ_{c,c′} Σ_j ‖S^c_j − S^{c′}_j‖_{L2} that focuses on differences due to a few nodes only.


Consensus LASSO

Proposal

Infer multiple networks by forcing them toward a consensual network, i.e., explicitly keeping the differences between conditions under control, but with an L2 penalty (which allows for more differences than group-LASSO type penalties).

Original optimization:

max_{(β^c_{jk})_{k,j}, c=1,…,k}  Σ_c log ML^c_j − λ Σ_{k,j} |β^c_{jk}|

[Ambroise et al., 2009, Chiquet et al., 2011]: this is equivalent to minimizing p problems of dimension k(p−1):

½ β_j^T Σ̂_{\j\j} β_j + β_j^T Σ̂_{j\j} + λ ‖β_j‖_{L1},   with β_j = (β^1_j, …, β^k_j),

where Σ̂_{\j\j} is the block-diagonal matrix Diag(Σ̂^1_{\j\j}, …, Σ̂^k_{\j\j}), and similarly for Σ̂_{j\j}.

Add a constraint to force inference toward a “consensus” β^cons:

½ β_j^T Σ̂_{\j\j} β_j + β_j^T Σ̂_{j\j} + λ ‖β_j‖_{L1} + μ Σ_c w_c ‖β^c_j − β^cons_j‖²_{L2}

with: w_c a real number used to weight the conditions; μ a regularization parameter; β^cons_j whatever you want...?


Choice of a consensus: set one...

Typical case:

- a prior network is known (e.g., from the bibliography);

- with no prior information, use a fixed prior corresponding to (e.g.) global inference;

⇒ a given (and fixed) β^cons.

Proposition

Using a fixed β^cons_j, the optimization problem is equivalent to minimizing the p following standard quadratic problems in R^{k(p−1)} with L1 penalty:

½ β_j^T B1(μ) β_j + β_j^T B2(μ) + λ ‖β_j‖_{L1},

where

B1(μ) = Σ̂_{\j\j} + 2μ I_{k(p−1)}, with I_{k(p−1)} the k(p−1) identity matrix;

B2(μ) = Σ̂_{j\j} − 2μ β^cons, with β^cons = ((β^cons_j)^T, …, (β^cons_j)^T)^T.
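The algebra behind this proposition is easy to check numerically. The sketch below (Python/NumPy, taking w_c = 1 and random stand-ins for Σ̂_{\j\j}, Σ̂_{j\j} and β^cons; all names are illustrative) verifies that the two objectives differ only by a constant independent of β_j:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6                                    # stands for the dimension k(p - 1)
M = rng.normal(size=(d, d))
Sig = M @ M.T                            # plays the block matrix Sigma_hat_{\j\j}
s = rng.normal(size=d)                   # plays Sigma_hat_{j\j}
bcons = rng.normal(size=d)               # fixed consensus, stacked k times
mu, lam = 0.7, 0.3

def original(b):                         # penalized objective with fixed consensus (w_c = 1)
    return (0.5 * b @ Sig @ b + b @ s + lam * np.abs(b).sum()
            + mu * np.sum((b - bcons) ** 2))

B1 = Sig + 2 * mu * np.eye(d)            # B1(mu) = Sigma_hat_{\j\j} + 2 mu I
B2 = s - 2 * mu * bcons                  # B2(mu) = Sigma_hat_{j\j} - 2 mu beta^cons

def quadratic(b):                        # the proposition's quadratic form
    return 0.5 * b @ B1 @ b + b @ B2 + lam * np.abs(b).sum()

const = mu * np.sum(bcons ** 2)          # term that does not depend on beta_j
for _ in range(3):
    b = rng.normal(size=d)
    print(abs(original(b) - (quadratic(b) + const)))   # ~0 each time
```

Since the two objectives differ by a constant, they share the same minimizer, which is the content of the proposition.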


Choice of a consensus: adapt one during training...

Derive the consensus from the condition-specific estimates:

β^cons_j = Σ_c (n_c / n) β^c_j

Proposition

Using β^cons_j = Σ_{c=1}^k (n_c / n) β^c_j, the optimization problem is equivalent to minimizing the following standard quadratic problem with L1 penalty:

½ β_j^T S_j(μ) β_j + β_j^T Σ̂_{j\j} + λ ‖β_j‖_{L1},

where S_j(μ) = Σ̂_{\j\j} + 2μ A^T(μ) A(μ), and A(μ) is a [k(p−1) × k(p−1)] matrix that does not depend on j.


Computational aspects: optimization

Common framework: the objective function decomposes into

- a convex part C(β_j) = ½ β_j^T Q¹_j(μ) β_j + β_j^T Q²_j(μ);

- the L1-norm penalty P(β_j) = ‖β_j‖_{L1}.

Optimization by “active set” [Osborne et al., 2000, Chiquet et al., 2011], for a given λ:

1: repeat
2:   given the active set A and β_{jj′} s.t. β_{jj′} ≠ 0 ∀ j′ ∈ A, solve (over h) the smooth minimization problem restricted to A: C(β_j + h) + λ P(β_j + h) ⇒ β_j ← β_j + h
3:   update A by adding the most violating variables, i.e., variables s.t. |∂C(β_j) + λ ∂P(β_j)| > 0, with [∂P(β_j)]_{j′} ∈ [−1, 1] if j′ ∉ A
4: until all first-order conditions are satisfied

Repeat along the regularization path, from large λ to small λ, using the previous β as a warm start.
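The stopping rule ("all first-order conditions satisfied") can be made concrete on a toy problem. The sketch below (Python/NumPy; a plain coordinate descent stands in for the active-set solver, and the problem sizes are arbitrary) solves a small quadratic + L1 problem and then checks the lasso KKT conditions that step 4 monitors:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
M = rng.normal(size=(d, d))
Q = M @ M.T + np.eye(d)                  # positive-definite quadratic part (plays Q1_j)
q = rng.normal(size=d)                   # linear part (plays Q2_j)
lam = 1.0

# Coordinate descent on C(b) + lam * ||b||_1, with C(b) = 0.5 b'Qb + b'q
b = np.zeros(d)
for _ in range(200):
    for j in range(d):
        r = q[j] + Q[j] @ b - Q[j, j] * b[j]            # partial gradient without the b_j term
        b[j] = -np.sign(r) * max(abs(r) - lam, 0.0) / Q[j, j]

g = Q @ b + q                            # gradient of the convex part at the solution
active = b != 0
print(np.all(np.abs(g[~active]) <= lam + 1e-6))                           # True: inactive coordinates satisfy |grad| <= lambda
print(np.allclose(g[active] + lam * np.sign(b[active]), 0, atol=1e-6))    # True: active coordinates satisfy grad + lambda*sign = 0
```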


Bootstrap estimation ≈ BOLASSO

For each bootstrap sample b: subsample n observations with replacement; run the cLasso estimation, giving, for varying λ, the coefficients (β^{λ,b}_{jj′})_{jj′}; then threshold, keeping the T1 largest coefficients.

Build a frequency table counting how often each pair of genes is kept across bootstrap samples, e.g.:

(1, 2)  (1, 3)  ...  (j, j′)  ...
 130     25     ...   120    ...

→ Keep the T2 most frequent pairs.
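The bootstrap loop can be sketched as follows (Python/NumPy; a plain partial-correlation ranking stands in for the cLasso fit, and B, T1, T2 and the simulated chain are arbitrary illustrations):

```python
import numpy as np
from collections import Counter

def candidate_edges(X, T1):
    """One bootstrap fit: keep the T1 pairs with the largest coefficients.
    (Here, partial correlations stand in for the cLasso estimates.)"""
    S = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(S))
    P = -S / np.outer(d, d)                    # partial correlations
    iu, ju = np.triu_indices_from(P, k=1)
    order = np.argsort(-np.abs(P[iu, ju]))[:T1]
    return [(int(iu[o]), int(ju[o])) for o in order]

def bootstrap_edges(X, B=100, T1=2, T2=2, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    counts = Counter()                         # the frequency table over pairs
    for _ in range(B):
        idx = rng.integers(0, n, size=n)       # subsample n observations with replacement
        counts.update(candidate_edges(X[idx], T1))
    return sorted(edge for edge, _ in counts.most_common(T2))   # keep the T2 most frequent pairs

# Chain 0 - 1 - 2: the two true edges should dominate the frequency table.
rng = np.random.default_rng(1)
n = 400
g0 = rng.normal(size=n)
g1 = g0 + 0.5 * rng.normal(size=n)
g2 = g1 + 0.5 * rng.normal(size=n)
X = np.column_stack([g0, g1, g2])
print(bootstrap_edges(X))                      # expect [(0, 1), (1, 2)]
```

The resampling step is what makes the selection stable: spurious pairs are picked only in a few resamples and never reach the top of the frequency table.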

Nathalie Villa-Vialaneix | Consensus Lasso 25/36

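The resampling scheme above can be sketched end to end. Here the cLasso fit is replaced by a crude stand-in (the largest off-diagonal entries of a shrunk precision matrix), so apart from the resample / keep-T1 / keep-T2 logic of the slide, everything in this sketch is our assumption.

```python
import numpy as np
from collections import Counter

def top_edges(data, t1):
    """Keep the t1 off-diagonal precision entries with largest absolute
    value -- a stand-in for one cLasso fit on a bootstrap sample."""
    cov = np.cov(data, rowvar=False) + 0.1 * np.eye(data.shape[1])
    prec = np.linalg.inv(cov)
    p = prec.shape[0]
    pairs = [(i, j) for i in range(p) for j in range(i + 1, p)]
    pairs.sort(key=lambda e: abs(prec[e]), reverse=True)
    return pairs[:t1]

rng = np.random.default_rng(1)
n, p, B, T1, T2 = 80, 8, 100, 5, 3
# toy data: pairs (0,1) and (2,3) strongly partially correlated
z = rng.standard_normal((n, p))
z[:, 1] += 2 * z[:, 0]
z[:, 3] += 2 * z[:, 2]

counts = Counter()
for _ in range(B):
    idx = rng.integers(0, n, size=n)        # resample rows with replacement
    counts.update(top_edges(z[idx], T1))
selected = [e for e, _ in counts.most_common(T2)]
```

Edges that survive the per-sample T1 threshold in most of the B resamples end up in `selected`; spurious edges rarely repeat and are filtered out.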


Simulated data

Expression data with known co-expression network

original network (scale free) taken from http://www.comp-sys-bio.org/AGN/data.html (100 nodes, ∼ 200 edges, loops removed);

rewire a ratio r of the edges to generate k "children" networks (sharing approximately 100(1 − 2r)% of their edges);

generate "expression data" with a random Gaussian process from each child:
- use the Laplacian of the graph to generate a putative concentration matrix;
- use edge colors in the original network to set the edge sign;
- correct the obtained matrix to make it positive definite;
- invert to obtain a covariance matrix...;
- ... which is used in a random Gaussian process to generate expression data (with noise).

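The generation pipeline in the bullets above (signed Laplacian → putative concentration matrix → positive-definite correction → inversion → Gaussian sampling with noise) can be sketched as follows. The sparsity level, the sign mechanism, and the diagonal-inflation correction are our assumptions, not necessarily the authors' exact choices.

```python
import numpy as np

rng = np.random.default_rng(2)
p = 20
# random sparse signed adjacency matrix (symmetric, no loops)
A = np.triu(rng.random((p, p)) < 0.1, k=1).astype(float)
signs = np.where(rng.random((p, p)) < 0.5, 1.0, -1.0)
A = A * signs
A = A + A.T

# Laplacian-style putative concentration matrix
K = np.diag(np.abs(A).sum(axis=1)) - A
# correct it to be positive definite by inflating the diagonal if needed
eigmin = np.linalg.eigvalsh(K).min()
if eigmin <= 1e-6:
    K += (0.1 + 1e-6 - eigmin) * np.eye(p)

Sigma = np.linalg.inv(K)                     # covariance matrix
n = 50
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
X += 0.05 * rng.standard_normal((n, p))      # observation noise
```

By construction, the nonzero pattern of K (hence the conditional-independence graph of the sampled data) matches the simulated network, which is what makes recall/precision evaluable.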

An example with k = 2, r = 5%

[figures: mother network³, first child, second child]

³ actually the parent network. My co-author wisely noted that the mistake was unforgivable for a feminist...

Choice for T2

Data: r = 0.05, k = 2 and n1 = n2 = 20; 100 bootstrap samples, µ = 1, T1 = 250 or 500

[precision/recall curves; best points marked at "30 selections at least" for T1 = 250 and at "40 / 43 selections at least" for T1 = 500]

Dots correspond to the best F = 2 × (precision × recall) / (precision + recall)

⇒ Best F corresponds to selecting a number of edges approximately equal to the number of edges in the original network.
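The F-score used to pick the dots can be computed directly from edge sets; a minimal helper (the function name and the toy edge sets are ours):

```python
def edge_scores(true_edges, inferred_edges):
    """Precision, recall and F-score of an inferred edge set."""
    true_edges, inferred_edges = set(true_edges), set(inferred_edges)
    tp = len(true_edges & inferred_edges)          # correctly inferred edges
    precision = tp / len(inferred_edges) if inferred_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall > 0 else 0.0)
    return precision, recall, f

truth = {(1, 2), (1, 3), (2, 4), (3, 4)}
guess = {(1, 2), (2, 4), (2, 3)}
scores = edge_scores(truth, guess)   # 2 true positives out of 3 guesses
```

Since F is the harmonic mean of precision and recall, it peaks when the inferred edge count roughly matches the true one, which is the observation made on this slide.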

Choice for T1 and µ

Best µ ∈ {0.1, 1} and T1 ∈ {250, 300, 500}, with the % of improvement of bootstrapping

rewired edges: 5%
network sizes     µ    T1   % improvement
20-20             1    500  30.69
20-30             0.1  500  11.87
30-30             1    300  20.15
50-50             1    300  14.36
20-20-20-20-20    1    500  86.04
30-30-30-30       0.1  500  42.67

rewired edges: 20%
network sizes     µ    T1   % improvement
20-20             0.1  300  -17.86
20-30             0.1  300  -18.35
30-30             1    500  -7.97
50-50             0.1  300  -7.83
20-20-20-20-20    0.1  500  10.27
30-30-30-30       1    500  13.48

Comparison with other approaches

Methods compared (direct and bootstrap approaches)

independent graphical LASSO estimation gLasso

methods implemented in the R package simone and described in [Chiquet et al., 2011]: intertwined LASSO iLasso, cooperative LASSO coopLasso and group LASSO groupLasso

fused graphical LASSO as described in [Danaher et al., 2013], as implemented in the R package fgLasso

consensus Lasso with:
- fixed prior (the mother network) cLasso(p)
- fixed prior (average over the conditions of independent estimations) cLasso(2)
- adaptive estimation of the prior cLasso(m)

Parameters set to: T1 = 500, B = 100, µ = 1


Selected results (best F)

rewired edges: 5% - conditions: 2 - sample size: 2 × 30

direct version
Method  gLasso  iLasso  groupLasso  coopLasso  fgLasso  cLasso(m)  cLasso(p)  cLasso(2)
F       0.28    0.35    0.32        0.35       0.32     0.31       0.86       0.30

bootstrap version
Method  gLasso  iLasso  groupLasso  coopLasso  fgLasso  cLasso(m)  cLasso(p)  cLasso(2)
F       0.31    0.34    0.36        0.34       0.36     0.37       0.86       0.35

Conclusions

bootstrapping improves performance (except for iLasso and for larger)

joint inference improves performance

using a good prior is (as expected) very efficient

the adaptive approach for cLasso is better than the naive approach

Real data

204 obese women; expression of 221 genes before and after a LCD (low-calorie diet)
µ = 1; T1 = 1000 (target density: 4%)

Distribution of the number of times an edge is selected over 100 bootstrap samples

[histogram: selection counts (0-100) vs. frequency (0-1500)]

(70% of the pairs of nodes are never selected) ⇒ T2 = 80

Networks

[two inferred networks, before diet and after diet, over the genes AZGP1, CIDEA, FADS1, FADS2, GPD1L, PCK2]

densities about 1.3% - some interactions (both shared and specific) make sense to the biologist

Thank you for your attention...

Programs available in the R package therese (on R-Forge)⁴. Joint work with:

Magali San Cristobal (GenPhySe, INRA Toulouse)
Matthieu Vignes (Massey University, NZ)
Nathalie Viguerie (I2MC, INSERM Toulouse)

⁴ https://r-forge.r-project.org/projects/therese-pkg

Questions?

References

Ambroise, C., Chiquet, J., and Matias, C. (2009). Inferring sparse Gaussian graphical models with latent structure. Electronic Journal of Statistics, 3:205-238.

Butte, A. and Kohane, I. (1999). Unsupervised knowledge discovery in medical databases using relevance networks. In Proceedings of the AMIA Symposium, pages 711-715.

Chiquet, J., Grandvalet, Y., and Ambroise, C. (2011). Inferring multiple graphical structures. Statistics and Computing, 21(4):537-553.

Danaher, P., Wang, P., and Witten, D. (2013). The joint graphical lasso for inverse covariance estimation across multiple classes. Journal of the Royal Statistical Society Series B. Forthcoming.

Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432-441.

Meinshausen, N. and Bühlmann, P. (2006). High dimensional graphs and variable selection with the lasso. Annals of Statistics, 34(3):1436-1462.

Mohan, K., Chung, J., Han, S., Witten, D., Lee, S., and Fazel, M. (2012). Structured learning of Gaussian graphical models. In Proceedings of NIPS (Neural Information Processing Systems) 2012, Lake Tahoe, Nevada, USA.

Osborne, M., Presnell, B., and Turlach, B. (2000). On the LASSO and its dual. Journal of Computational and Graphical Statistics, 9(2):319-337.

Schäfer, J. and Strimmer, K. (2005a). An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics, 21(6):754-764.

Schäfer, J. and Strimmer, K. (2005b). A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4:1-32.

Viguerie, N., Montastier, E., Maoret, J., Roussel, B., Combes, M., Valle, C., Villa-Vialaneix, N., Iacovoni, J., Martinez, J., Holst, C., Astrup, A., Vidal, H., Clément, K., Hager, J., Saris, W., and Langin, D. (2012). Determinants of human adipose tissue gene expression: impact of diet, sex, metabolic status and cis genetic regulation. PLoS Genetics, 8(9):e1002959.