NetBioSIG2014-Talk by Hyunghoon Cho
Identifying context-dependent community structure across multiple networks
Hyunghoon Cho, Gerald Quon, Bonnie Berger, Manolis Kellis
MIT CSAIL
ISMB Network Biology SIG, July 11th, 2014
Modules / communities
Cellular functions are carried out by groups of biomolecules (e.g., proteins, RNA) acting in a coordinated fashion.
Problem: how does this structure change under a different condition?
Detecting changes in modules
[Figure: modules (1, 2, 3) versus contexts (1, 2, ..., K)]
Approaches to module detection
• Many algorithms for detecting modules in a single network
– Link clustering [Shi et al. 2013], label propagation [Gregory 2010], tensor decomposition [Anandkumar et al. 2013], mixed-membership stochastic blockmodels [Airoldi et al. 2008], etc.
• Not obvious how to extend to the multiple network case:
– Combine networks, then detect modules: likely to miss rare modules
– Detect modules, then combine results: inconsistent module definition
Multi-MMSB
Jointly learns modules from all networks, allowing each module to be present in only a subset of networks
Model description: SB
Note: each node belongs to a single module
Adjacency matrix
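As a rough illustration of the single-membership blockmodel, the sketch below samples an undirected adjacency matrix in which every node has exactly one module label. The names `sample_sbm`, `labels`, and `B` are illustrative assumptions and do not come from the talk; `B[k][l]` is taken to be the probability of an edge between a node in module k and a node in module l.

```python
import random

def sample_sbm(labels, B, seed=0):
    """Sample an undirected adjacency matrix from a stochastic blockmodel.

    labels[i] is the single module of node i (assumed notation).
    B[k][l] is the edge probability between modules k and l (assumed symmetric).
    """
    rng = random.Random(seed)
    n = len(labels)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The edge probability depends only on the two nodes' modules.
            if rng.random() < B[labels[i]][labels[j]]:
                A[i][j] = A[j][i] = 1
    return A

# Two modules of three nodes each, dense within and sparse between.
labels = [0, 0, 0, 1, 1, 1]
B = [[0.9, 0.05], [0.05, 0.9]]
A = sample_sbm(labels, B)
```

Sorting nodes by label makes the block pattern visible in the adjacency matrix: dense diagonal blocks and sparse off-diagonal blocks.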
Model description: MMSB
[Airoldi et al., 2008]
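A minimal sketch of the MMSB generative process follows: each node draws a membership vector from a Dirichlet prior, and for each pair of nodes both endpoints draw a module for that interaction before the edge is sampled. The original model of Airoldi et al. is directed; sampling one edge per unordered pair is a simplifying assumption made here, and the names `sample_mmsb`, `alpha`, `pi`, and `B` are illustrative rather than taken from the slides.

```python
import random

def sample_mmsb(n, alpha, B, seed=0):
    """Sample mixed memberships and an undirected graph from an MMSB-style model.

    alpha: Dirichlet concentration parameters, one per module (assumed notation).
    B[k][l]: edge probability when the endpoints act as modules k and l.
    """
    rng = random.Random(seed)
    K = len(alpha)
    # Dirichlet draw per node via normalized Gamma variates.
    pi = []
    for _ in range(n):
        g = [rng.gammavariate(a, 1.0) for a in alpha]
        total = sum(g)
        pi.append([x / total for x in g])
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each endpoint picks the module it uses for this interaction.
            zi = rng.choices(range(K), weights=pi[i])[0]
            zj = rng.choices(range(K), weights=pi[j])[0]
            if rng.random() < B[zi][zj]:
                A[i][j] = A[j][i] = 1
    return pi, A

pi, A_mm = sample_mmsb(n=8, alpha=[0.5, 0.5], B=[[0.9, 0.05], [0.05, 0.9]])
```

Unlike the single-membership blockmodel, a node here can participate in different modules for different edges, which is what allows overlapping communities.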
Model description: Multi-MMSB
Learning the model
Goal: optimize model likelihood
Expectation-Maximization algorithm to deal with latent variables
Need variational approximation
Random restarts to alleviate local optima issue
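The random-restart idea can be sketched generically: run the local optimizer from several random initializations and keep the run with the best objective. The function `best_of_restarts` and the callable `fit_once` are hypothetical names, and the toy objective below merely stands in for a real variational EM fit, which is not reproduced here.

```python
import random

def best_of_restarts(fit_once, n_restarts=10, seed=0):
    """Keep the best result over several randomly initialized runs.

    fit_once(rng) is a hypothetical callable that performs one full
    optimization from a random start and returns (params, objective),
    where a higher objective is better.
    """
    rng = random.Random(seed)
    best_params, best_obj = None, float("-inf")
    for _ in range(n_restarts):
        params, obj = fit_once(rng)
        if obj > best_obj:
            best_params, best_obj = params, obj
    return best_params, best_obj

def toy_fit(rng):
    """Toy stand-in: gradient ascent on f(x) = -(x^2 - 1)^2 + 0.3x,
    which has two local maxima of different heights."""
    x = rng.uniform(-2.0, 2.0)
    for _ in range(200):
        grad = -4.0 * x * (x * x - 1.0) + 0.3
        x += 0.01 * grad
    return x, -(x * x - 1.0) ** 2 + 0.3 * x

best_x, best_obj = best_of_restarts(toy_fit, n_restarts=10, seed=0)
```

A single run can stall at the lower of the two maxima depending on where it starts; taking the best of several runs makes that outcome less likely but does not guarantee the global optimum.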
Performance metric
• Normalized mutual information (NMI)
[Diagram: a sequence of structural queries is posed to both the learned community structure and the true community structure; mutual information is calculated between the two sets of answers]
[Esquivel and Rosvall, 2012]
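For orientation, the sketch below computes the standard NMI between two hard (non-overlapping) partitions. This is not the query-based variant of Esquivel and Rosvall used in the talk, which is designed for overlapping communities; it is only the simpler textbook form, normalized here by the mean of the two entropies. The function name `nmi` is an illustrative choice.

```python
import math
from collections import Counter

def nmi(labels_a, labels_b):
    """Normalized mutual information between two hard partitions.

    Uses 2 * I(A; B) / (H(A) + H(B)); returns 1.0 when both partitions
    are trivial (a single cluster each).
    """
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    h_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    h_b = -sum(c / n * math.log(c / n) for c in count_b.values())
    mi = sum(
        c / n * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )
    if h_a + h_b == 0:
        return 1.0
    return 2.0 * mi / (h_a + h_b)

same = nmi([0, 0, 1, 1], [1, 1, 0, 0])         # same partition, labels renamed
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])  # statistically independent partitions
```

The score is invariant to relabeling the modules, which is why it suits comparing a learned structure against a ground truth whose module indices are arbitrary.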
Synthetic data: results
[Figures (four slides): normalized mutual information on synthetic data]
Asthma data (GSE19301)
Microarray profiling of peripheral blood mononuclear cells from asthma patients at 3 different stages:
• quiet: 394 samples
• exacerbation: 125 samples
• follow-up (2 weeks after exacerbation): 166 samples
[Bjornsdottir et al., 2011]
Asthma data: results
RNA decay data (GSE37451)
Microarray profiling of 70 lymphoblastoid cell lines at 5 different timepoints after transcription arrest:
• 0 hr (before transcription arrest)
• 0.5 hr
• 1 hr
• 2 hr
• 4 hr
RNA decay data: results
Summary
• We developed Multi-MMSB, a flexible way of learning community structure over multiple networks
• Multi-MMSB outperformed naive methods on synthetic data
• When applied to real data, Multi-MMSB identified context-specific modules that are biologically plausible
Future directions
• Extending the model:
– Directed networks
– Weighted edges
• Application to other types of biological networks:
– Regulatory networks
– PPI
Acknowledgements
• Gerald Quon
• Prof. Bonnie Berger
• Prof. Manolis Kellis