NetBioSIG2014-Talk by Hyunghoon Cho

Posted on 10-May-2015


NetBioSIG2014 at ISMB in Boston, MA, USA on July 11, 2014

Transcript of NetBioSIG2014-Talk by Hyunghoon Cho

Identifying context-dependent community structure across multiple networks

Hyunghoon Cho, Gerald Quon, Bonnie Berger, Manolis Kellis

MIT CSAIL

ISMB Network Biology SIG

July 11th, 2014

Modules / communities

Cellular functions are carried out by groups of biomolecules (e.g., proteins, RNA) acting in a coordinated fashion.

Problem: how does this structure change under a different condition?

Detecting changes in modules

[Figure: modules (1, 2, 3) versus contexts (1, 2, ..., K); axis labels: Context, Module]

Approaches to module detection

• Many algorithms for detecting modules in a single network

– Link clustering [Shi et al. 2013], label propagation [Gregory 2010], tensor decomposition [Anandkumar et al. 2013], mixed-membership stochastic blockmodels [Airoldi et al. 2008], etc.

• Not obvious how to extend to the multiple network case:

– Combine networks, then detect modules: likely to miss rare modules

– Detect modules, then combine results: inconsistent module definition

Multi-MMSB

Jointly learns modules from all networks, allowing each module to be present in only a subset of networks

Model description: SB

Note: each node belongs to a single module

Adjacency matrix
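The single-membership setting above can be sketched as a small sampler: every node carries exactly one module label, and each entry of the adjacency matrix is drawn from a probability that depends only on the two modules involved. The function name `sample_sbm`, the block-probability matrix `B`, and the example values are illustrative assumptions, not taken from the talk.

```python
import random

def sample_sbm(labels, B, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a stochastic blockmodel.

    labels[i] is the single module of node i; B[r][s] is the (assumed)
    probability of an edge between a node in module r and one in module s.
    """
    rng = random.Random(seed)
    n = len(labels)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # undirected, no self-loops
            if rng.random() < B[labels[i]][labels[j]]:
                A[i][j] = A[j][i] = 1
    return A

# Two modules of three nodes each: dense within, sparse between.
labels = [0, 0, 0, 1, 1, 1]
B = [[0.9, 0.1],
     [0.1, 0.9]]
A = sample_sbm(labels, B)
```

With these values, the sampled matrix tends to show two dense diagonal blocks, which is the structure a blockmodel is meant to recover.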

Model description: MMSB

[Airoldi et al., 2008]
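As a rough illustration of the mixed-membership idea, the sketch below follows the generative process commonly associated with MMSB: each node draws a membership vector from a Dirichlet prior, and for every node pair each endpoint picks a module from its own vector before the edge is drawn. The symmetric, undirected treatment of pairs, the function names, and the parameter values are simplifying assumptions and may differ from the formulation shown in the talk.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_mmsb(n, alpha, B, seed=0):
    """Sample membership vectors and an undirected adjacency matrix.

    alpha: Dirichlet concentration, one entry per module (assumed values).
    B[r][s]: edge probability between modules r and s (assumed values).
    """
    rng = random.Random(seed)
    K = len(alpha)
    pi = [sample_dirichlet(alpha, rng) for _ in range(n)]
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each endpoint picks a module for this particular interaction.
            zi = rng.choices(range(K), weights=pi[i])[0]
            zj = rng.choices(range(K), weights=pi[j])[0]
            if rng.random() < B[zi][zj]:
                A[i][j] = A[j][i] = 1
    return pi, A

pi, A = sample_mmsb(n=8, alpha=[0.5, 0.5], B=[[0.9, 0.05], [0.05, 0.9]])
```

Unlike the single-membership blockmodel, a node here can act as a member of different modules in different interactions, which is what allows overlapping communities.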

Model description: Multi-MMSB

Learning the model

Goal: optimize model likelihood

Expectation-Maximization algorithm to deal with latent variables

Need variational approximation

Random restarts to alleviate local optima issue
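The restart strategy can be illustrated on a much simpler model than Multi-MMSB. The sketch below runs plain EM (not the variational EM used in the talk) on a toy one-dimensional mixture of two unit-variance Gaussians, repeats it from several random initializations, and keeps the run with the highest log-likelihood. The model, function names, and data are stand-ins chosen only to show the pattern.

```python
import math
import random

def log_norm(x, mu):
    """Log density of a unit-variance Gaussian with mean mu."""
    return -0.5 * (x - mu) ** 2 - 0.5 * math.log(2 * math.pi)

def log_sum_exp(a, b):
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def em_two_gaussians(data, rng, n_iter=50):
    """EM for two equally weighted unit-variance Gaussians; learns the means."""
    mu = [rng.choice(data), rng.choice(data)]  # random initialization
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for each point.
        resp = []
        for x in data:
            la, lb = log_norm(x, mu[0]), log_norm(x, mu[1])
            resp.append(math.exp(la - log_sum_exp(la, lb)))
        # M-step: responsibility-weighted means (guarded against zero weight).
        w0 = max(sum(resp), 1e-12)
        w1 = max(sum(1 - r for r in resp), 1e-12)
        mu = [sum(r * x for r, x in zip(resp, data)) / w0,
              sum((1 - r) * x for r, x in zip(resp, data)) / w1]
    loglik = sum(math.log(0.5) + log_sum_exp(log_norm(x, mu[0]),
                                             log_norm(x, mu[1]))
                 for x in data)
    return loglik, mu

def best_of_restarts(data, n_restarts=10, seed=0):
    """Run EM from several random starts and keep the best log-likelihood."""
    rng = random.Random(seed)
    runs = [em_two_gaussians(data, rng) for _ in range(n_restarts)]
    return max(runs, key=lambda run: run[0])

data = [-2.1, -1.9, -2.0, 2.0, 1.9, 2.1]
best_loglik, best_means = best_of_restarts(data)
```

In this toy setting, an initialization that places both means on the same value leaves EM stuck at a poor stationary point, so keeping the best of several runs guards against that outcome.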

Performance metric

• Normalized mutual information (NMI)

[Diagram: a sequence of structural queries is posed to both the learned community structure and the true community structure; the mutual information between the two sets of answers is calculated.]

[Esquivel and Rosvall, 2012]
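For orientation, the sketch below computes the standard NMI between two non-overlapping label assignments, normalized by the mean of the two entropies. It is not the query-based generalization of Esquivel and Rosvall cited above, which is designed for overlapping communities; the function names are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(x, y):
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # p(a,b) * log(p(a,b) / (p(a) p(b))), written with raw counts.
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def nmi(x, y):
    """NMI = 2 I(X;Y) / (H(X) + H(Y)) for two hard label assignments."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 1.0  # both assignments put every node in one module
    return 2 * mutual_information(x, y) / (hx + hy)

learned = [0, 0, 1, 1]
truth = [1, 1, 0, 0]
score = nmi(learned, truth)  # same partition up to relabeling
```

The score depends only on how nodes are grouped, not on the label names, so a learned structure that matches the truth up to relabeling scores 1, and an unrelated one scores near 0.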

Synthetic data: results

[Plot; y-axis: normalized mutual information]


Asthma data (GSE19301)

Microarray profiling of peripheral blood mononuclear cells from asthma patients at 3 different stages:

• quiet: 394 samples

• exacerbation: 125 samples

• follow-up (2 weeks after exacerbation): 166 samples

[Bjornsdottir et al., 2011]

Asthma data: results

RNA decay data (GSE37451)

Microarray profiling of 70 lymphoblastoid cell lines at 5 different timepoints after transcription arrest:

• 0 hr (before transcription arrest)

• 0.5 hr

• 1 hr

• 2 hr

• 4 hr

RNA decay data: results

Summary

• We developed Multi-MMSB, a flexible way of learning community structure over multiple networks

• Multi-MMSB outperformed naive methods on synthetic data

• When applied to real data, Multi-MMSB identified context-specific modules that are biologically plausible

Future directions

• Extending the model:

– Directed networks

– Weighted edges

• Application to other types of biological networks:

– Regulatory networks

– PPI

Acknowledgements

• Gerald Quon

• Prof. Bonnie Berger

• Prof. Manolis Kellis