NetBioSIG2014-Talk by Hyunghoon Cho

Identifying context-dependent community structure across multiple networks. Hyunghoon Cho, Gerald Quon, Bonnie Berger, Manolis Kellis. MIT CSAIL. ISMB Network Biology SIG, July 11th, 2014.

description

NetBioSIG2014 at ISMB in Boston, MA, USA on July 11, 2014

Transcript of NetBioSIG2014-Talk by Hyunghoon Cho

Page 1: NetBioSIG2014-Talk by Hyunghoon Cho

Identifying context-dependent community structure across multiple networks

Hyunghoon Cho, Gerald Quon, Bonnie Berger, Manolis Kellis

MIT CSAIL

ISMB Network Biology SIG

July 11th, 2014

Page 2: NetBioSIG2014-Talk by Hyunghoon Cho

Modules / communities

Cellular functions are carried out by groups of biomolecules (e.g., proteins, RNA) acting in a coordinated fashion.

Problem: how does this structure change under a different condition?

Page 3: NetBioSIG2014-Talk by Hyunghoon Cho

Detecting changes in modules

[Figure: diagram with axes labeled Context (1, 2, 3) and Module (1, 2, 3), and vectors v_1, v_2, ..., v_K]

Page 4: NetBioSIG2014-Talk by Hyunghoon Cho

Approaches to module detection

• Many algorithms for detecting modules in a single network

– Link clustering [Shi et al. 2013], label propagation [Gregory 2010], tensor decomposition [Anandkumar et al. 2013], mixed-membership stochastic blockmodels [Airoldi et al. 2008], etc.

• Not obvious how to extend to the multiple network case:

– Combine networks, then detect modules: likely to miss rare modules

– Detect modules, then combine results: inconsistent module definition

Multi-MMSB: jointly learns modules from all networks, allowing each module to be present in only a subset of networks

Page 5: NetBioSIG2014-Talk by Hyunghoon Cho

Model description: SB

Note: each node belongs to a single module

Adjacency matrix
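As a minimal sketch of the stochastic blockmodel (SB) on this slide, the Python code below samples an undirected adjacency matrix in which each node belongs to exactly one module. The function name `sample_sbm`, the block probability matrix `B`, and the example values are illustrative assumptions, not taken from the talk.

```python
import random

def sample_sbm(labels, B, seed=0):
    """Sample a symmetric 0/1 adjacency matrix from a stochastic blockmodel.

    labels[i] is the single module of node i (hypothetical input);
    B[k][l] is the edge probability between modules k and l.
    """
    rng = random.Random(seed)
    n = len(labels)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The edge probability depends only on the two nodes' modules.
            if rng.random() < B[labels[i]][labels[j]]:
                A[i][j] = A[j][i] = 1
    return A

# Illustrative values: two modules, dense within and sparse between.
labels = [0, 0, 0, 1, 1, 1]
B = [[0.9, 0.1], [0.1, 0.9]]
A = sample_sbm(labels, B)
```

With a block-diagonal `B` like the one above, nodes in the same module tend to be connected, so the adjacency matrix shows dense diagonal blocks when rows and columns are ordered by module.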

Page 6: NetBioSIG2014-Talk by Hyunghoon Cho

Model description: MMSB

[Airoldi et al., 2008]
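A rough sketch of the mixed-membership idea follows: each node draws a membership vector over modules, and for every pair of nodes each endpoint picks a module from its own vector before the edge is sampled. This simplifies the model of Airoldi et al. to undirected edges; the function name `sample_mmsb` and all parameter values are assumptions made for illustration.

```python
import random

def sample_mmsb(n, alpha, B, seed=0):
    """Sample memberships and an undirected graph, MMSB-style (simplified)."""
    rng = random.Random(seed)
    K = len(alpha)
    # pi[p] ~ Dirichlet(alpha), drawn by normalizing Gamma variates.
    pi = []
    for _ in range(n):
        g = [rng.gammavariate(a, 1.0) for a in alpha]
        total = sum(g)
        pi.append([x / total for x in g])
    A = [[0] * n for _ in range(n)]
    for p in range(n):
        for q in range(p + 1, n):
            # Each endpoint picks a module for this particular interaction.
            zp = rng.choices(range(K), weights=pi[p])[0]
            zq = rng.choices(range(K), weights=pi[q])[0]
            if rng.random() < B[zp][zq]:
                A[p][q] = A[q][p] = 1
    return pi, A

# Illustrative values only.
pi, A = sample_mmsb(n=5, alpha=[1.0, 1.0], B=[[0.9, 0.1], [0.1, 0.9]])
```

Unlike the single-membership blockmodel, a node here can take part in different modules for different interactions, which is what allows overlapping communities.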

Page 7: NetBioSIG2014-Talk by Hyunghoon Cho

Model description: Multi-MMSB

Page 8: NetBioSIG2014-Talk by Hyunghoon Cho

Learning the model

Goal: optimize model likelihood

Expectation-Maximization algorithm to deal with latent variables

Need variational approximation

Random restarts to alleviate local optima issue
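The random-restart strategy can be sketched as follows: run the optimizer from several random starting points and keep the run with the best objective. Here `fit_once` is a hypothetical stand-in for one variational EM run (a greedy local search on a toy objective), not the procedure used in the talk.

```python
import math
import random

def toy_objective(x):
    # Stand-in for a log-likelihood with several local optima.
    return math.sin(3.0 * x) - 0.1 * (x - 2.0) ** 2

def fit_once(rng, steps=200, step_size=0.05):
    """Hypothetical single run: greedy local search from a random start."""
    x = rng.uniform(-5.0, 5.0)
    score = toy_objective(x)
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        cand_score = toy_objective(candidate)
        if cand_score > score:  # accept only improvements
            x, score = candidate, cand_score
    return x, score

def best_of_restarts(n_restarts, seed=0):
    """Run fit_once n_restarts times and return the best run and all runs."""
    rng = random.Random(seed)
    runs = [fit_once(rng) for _ in range(n_restarts)]
    best = max(runs, key=lambda run: run[1])
    return best, runs

best, runs = best_of_restarts(10)
```

Each run can only climb to a nearby local optimum, so the final score depends on the starting point; taking the maximum over restarts reduces, but does not remove, that dependence.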

Page 9: NetBioSIG2014-Talk by Hyunghoon Cho

Performance metric

• Normalized mutual information (NMI)

[Figure: a sequence of structural queries is posed to both the learned community structure and the true community structure; mutual information is calculated between the two sets of answers]

[Esquivel and Rosvall, 2012]
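The metric on this slide compares the answers that the learned and true structures give to a sequence of structural queries. As a simpler illustration of the underlying quantity, the sketch below computes normalized mutual information between two hard (non-overlapping) partitions. The normalization 2I(X;Y) / (H(X) + H(Y)) and the function names are assumptions; this is not the query-based variant cited above.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information (in nats) between two label assignments."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    mi = 0.0
    for (x, y), c in joint.items():
        mi += (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
    return mi

def nmi(a, b):
    """NMI = 2 I(a; b) / (H(a) + H(b)); assumed normalization."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a + h_b == 0.0:
        return 1.0  # both partitions put every node in one module
    return 2.0 * mutual_information(a, b) / (h_a + h_b)

same_up_to_relabeling = nmi([0, 0, 1, 1], [1, 1, 0, 0])
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

The score is invariant to relabeling of modules, so it is near 1 when the two partitions agree up to module names and near 0 when knowing one partition says nothing about the other.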

Page 10: NetBioSIG2014-Talk by Hyunghoon Cho

Synthetic data: results

[Figure: plot with y-axis "Normalized mutual information"]

Page 11: NetBioSIG2014-Talk by Hyunghoon Cho

Synthetic data: results

Page 12: NetBioSIG2014-Talk by Hyunghoon Cho

Synthetic data: results

Page 13: NetBioSIG2014-Talk by Hyunghoon Cho

Synthetic data: results

Page 14: NetBioSIG2014-Talk by Hyunghoon Cho

Asthma data (GSE19301)

Microarray profiling of peripheral blood mononuclear cells from asthma patients at 3 different stages:

• quiet: 394 samples

• exacerbation: 125 samples

• follow-up (2 weeks after exacerbation): 166 samples

[Bjornsdottir et al., 2011]

Page 15: NetBioSIG2014-Talk by Hyunghoon Cho

Asthma data: results

Page 16: NetBioSIG2014-Talk by Hyunghoon Cho

RNA decay data (GSE37451)

Microarray profiling of 70 lymphoblastoid cell lines at 5 different timepoints after transcription arrest:

• 0 hr (before transcription arrest)

• 0.5 hr

• 1 hr

• 2 hr

• 4 hr

Page 17: NetBioSIG2014-Talk by Hyunghoon Cho

RNA decay data: results

Page 18: NetBioSIG2014-Talk by Hyunghoon Cho

Summary

• We developed Multi-MMSB, a flexible way of learning community structure over multiple networks

• Multi-MMSB outperformed naive methods on synthetic data

• When applied to real data, Multi-MMSB identified context-specific modules that are biologically plausible

Page 19: NetBioSIG2014-Talk by Hyunghoon Cho

Future directions

• Extending the model:

– Directed networks

– Weighted edges

• Application to other types of biological networks:

– Regulatory networks

– PPI

Page 20: NetBioSIG2014-Talk by Hyunghoon Cho

Acknowledgements

• Gerald Quon

• Prof. Bonnie Berger

• Prof. Manolis Kellis