009_20150201_Structural Inference for Uncertain Networks

Structural Inference for Uncertain Networks. Tran Quoc Hoan (@k09ht, haduonght.wordpress.com/). 1 February 2016, Paper Alert, Hasegawa lab., The University of Tokyo. Paper: Travis Martin, Brian Ball, and M. E. J. Newman, Phys. Rev. E 93, 012306 – Published 15 January 2016

Transcript of 009_20150201_Structural Inference for Uncertain Networks

Page 1: 009_20150201_Structural Inference for Uncertain Networks

Structural Inference for Uncertain Networks

Tran Quoc Hoan

@k09ht haduonght.wordpress.com/

1 February 2016, Paper Alert, Hasegawa lab., Tokyo

The University of Tokyo

Travis Martin, Brian Ball, and M. E. J. Newman, Phys. Rev. E 93, 012306 – Published 15 January 2016

Page 2: 009_20150201_Structural Inference for Uncertain Networks

Abstract


“… Rather than knowing the structure of a network exactly, we know the connections between nodes only with a certain probability. In this paper we develop methods for the analysis of such uncertain data, focusing particularly on the problem of community detection…”

“…We give a principled maximum-likelihood method for inferring community structure and demonstrate how the results can be used to make improved estimates of the true structure of the network.…”

Page 3: 009_20150201_Structural Inference for Uncertain Networks

Outline

• Motivation

- Analyze networks represented by uncertain measurements of their edges

• Proposal

- Fit a generative network model to the data using a combination of an EM algorithm and belief propagation

• Applications

- Reconstruct the underlying structure of the network (community detection, edge recovery, …)

Page 4: 009_20150201_Structural Inference for Uncertain Networks

Focus problem

• Uncertain structure network

- Each pair of nodes i, j has an edge that exists with probability Qij

- The observed data are a noisy representation of the true network

• Community detection

- Classify the nodes into non-overlapping communities

- Communities = groups of nodes with dense connections within groups and sparse connections between groups

• Approach

- Define a generative model for uncertain community-structured networks

- Fit the model to the observed data

- Read off the community structure

• Trivial approach = threshold the probabilities Qij
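The thresholding baseline can be sketched as follows. This is a minimal illustration, assuming Q is a symmetric NumPy array of edge probabilities; the function name `threshold_network` and the cutoff `tau` are hypothetical and do not come from the slides.

```python
import numpy as np

def threshold_network(Q, tau=0.5):
    """Return a 0/1 adjacency matrix that keeps pairs with Q_ij >= tau."""
    Q = np.asarray(Q, dtype=float)
    A = (Q >= tau).astype(int)
    np.fill_diagonal(A, 0)  # no self-loops
    return A

# Toy matrix of edge probabilities for three nodes (made-up values)
Q = np.array([[0.0, 0.9, 0.4],
              [0.9, 0.0, 0.6],
              [0.4, 0.6, 0.0]])
A = threshold_network(Q, tau=0.5)
```

Thresholding throws away how confident each measurement was: a pair with Qij = 0.51 and a pair with Qij = 0.99 are treated identically, which is the information the generative model is meant to keep.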

Page 5: 009_20150201_Structural Inference for Uncertain Networks

Model

• Stochastic block model

- n nodes are distributed at random among k groups

- γr : probability that a node is assigned to group r, with $\sum_{r=1}^{k} \gamma_r = 1$

- wrs : probability of placing an undirected edge between a node in group r and a node in group s (depends only on r, s)

- If wrr >> wrs (r ≠ s) then the network has traditional assortative community structure

- Probability of generating a network (given γr, wrs) in which node i is assigned to group gi and the adjacency matrix is A (Aij = 1 if there is an edge between i and j, 0 otherwise):

$$P(A, g \mid \gamma, w) = \prod_i \gamma_{g_i} \prod_{i<j} w_{g_i g_j}^{A_{ij}} \, (1 - w_{g_i g_j})^{1 - A_{ij}} \qquad (1)$$
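A sketch of the logarithm of the probability labeled (1), assuming the standard Bernoulli form of the stochastic block model (each pair i < j is an edge independently with probability w for its pair of groups). The function name `sbm_log_prob` is hypothetical, and the code assumes every entry of w lies strictly between 0 and 1.

```python
import numpy as np

def sbm_log_prob(A, g, gamma, w):
    """log P(A, g | gamma, w) for the Bernoulli stochastic block model."""
    A = np.asarray(A)
    g = np.asarray(g)
    gamma = np.asarray(gamma, dtype=float)
    w = np.asarray(w, dtype=float)
    n = len(g)
    logp = np.sum(np.log(gamma[g]))        # group-assignment term
    iu, ju = np.triu_indices(n, k=1)       # visit each pair i < j once
    w_pair = w[g[iu], g[ju]]               # edge probability for each pair
    a = A[iu, ju]
    logp += np.sum(a * np.log(w_pair) + (1 - a) * np.log(1 - w_pair))
    return logp

gamma = np.array([0.5, 0.5])
w = np.array([[0.8, 0.1],
              [0.1, 0.8]])                 # assortative: w_rr >> w_rs
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])
lp_good = sbm_log_prob(A, [0, 0, 1, 1], gamma, w)  # groups match the edges
lp_bad = sbm_log_prob(A, [0, 1, 0, 1], gamma, w)   # groups cut across the edges
```

With assortative parameters, the assignment that places connected nodes in the same group gets the higher log-probability.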

Page 6: 009_20150201_Structural Inference for Uncertain Networks

Model

• Generative model

- Each pair of nodes i, j is assigned a probability Qij of being connected by an edge, drawn from different distributions for edges (Aij = 1) and non-edges (Aij = 0)

- Probability that a true network represented by A = {Aij} gives rise to a matrix of observed edge probabilities Q = {Qij}
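The probability described in the last bullet can be written out as below. This is a sketch under stated assumptions: the symbols α(Q) and β(Q) for the distributions of Qij on edges and on non-edges are notation introduced here, and the pairs are taken to be independent.

```latex
P(Q \mid A) = \prod_{i<j} \bigl[\alpha(Q_{ij})\bigr]^{A_{ij}} \bigl[\beta(Q_{ij})\bigr]^{1 - A_{ij}}
```

Each pair contributes α(Qij) if it is a true edge and β(Qij) if it is not.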

Page 7: 009_20150201_Structural Inference for Uncertain Networks

Model

• Generative model

- Number of edges with observed probability between Q and Q + dQ

- Number of non-edges with observed probability between Q and Q + dQ

- A value of Qij (assumed independent)

- m: total number of edges in the underlying true network, which can be approximated by $\sum_{i<j} Q_{ij}$
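The quantities listed on this slide can be tied together as follows. This is a reconstruction from the definitions, not a copy of the slide's formulas: α(Q) and β(Q) denote the distributions of Qij on edges and on non-edges, N is the number of node pairs, and p is the edge density (all notation assumed here). The key step is that, for Qij to be a correct probability, a fraction Q of the pairs observed at value Q must be true edges.

```latex
\begin{align*}
\text{edges in } [Q, Q + dQ] &: \; m\,\alpha(Q)\,dQ, \\
\text{non-edges in } [Q, Q + dQ] &: \; (N - m)\,\beta(Q)\,dQ, \qquad N = \tbinom{n}{2}, \\
Q &= \frac{m\,\alpha(Q)}{m\,\alpha(Q) + (N - m)\,\beta(Q)}
\;\Longrightarrow\;
\frac{\alpha(Q)}{\beta(Q)} = \frac{1 - p}{p}\,\frac{Q}{1 - Q}, \qquad p = \frac{m}{N}, \\
m &\approx \sum_{i<j} Q_{ij}.
\end{align*}
```

The last line holds because the expected number of edges is the sum of the individual edge probabilities.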

Page 8: 009_20150201_Structural Inference for Uncertain Networks

Model


• Generative model

From (2) and (4)

Constant

Likelihood

From (1) and (6)
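A sketch of how these steps fit together, reconstructed from the definitions rather than copied from the slide. It assumes the ratio of the edge and non-edge distributions is α(Q)/β(Q) = ((1 − p)/p) · Q/(1 − Q) with p the edge density, and it sums over the unknown adjacency matrix A; the slide's own equations may group the terms differently.

```latex
\begin{align*}
P(Q \mid A) &= \Bigl[\prod_{i<j} \beta(Q_{ij})\Bigr]
  \prod_{i<j} \Bigl[\frac{1-p}{p}\,\frac{Q_{ij}}{1-Q_{ij}}\Bigr]^{A_{ij}}, \\
P(Q, A, g \mid \gamma, w) &= P(Q \mid A)\, P(A, g \mid \gamma, w), \\
P(Q, g \mid \gamma, w) &= \sum_A P(Q, A, g \mid \gamma, w) \\
 &= \Bigl[\prod_{i<j} \beta(Q_{ij})\Bigr] \prod_i \gamma_{g_i}
  \prod_{i<j} \Bigl[ w_{g_i g_j}\,\frac{1-p}{p}\,\frac{Q_{ij}}{1-Q_{ij}} + 1 - w_{g_i g_j} \Bigr].
\end{align*}
```

The bracketed product over β(Qij) does not depend on A, g, γ, or w, so it plays the role of the constant and can be dropped when maximizing the likelihood.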

Page 9: 009_20150201_Structural Inference for Uncertain Networks

Methods

• Fitting to empirical data

Maximize the marginal likelihood

Jensen’s inequality
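The bound obtained from Jensen's inequality has the standard form below, written here as a sketch; q(g) is any probability distribution over group assignments g, and the slide's own equation may be arranged differently.

```latex
\log P(Q \mid \gamma, w)
  = \log \sum_g P(Q, g \mid \gamma, w)
  \;\ge\; \sum_g q(g) \log \frac{P(Q, g \mid \gamma, w)}{q(g)},
  \qquad \sum_g q(g) = 1 .
```

The bound is tight when q(g) is the posterior distribution of g given Q, that is, q(g) = P(Q, g | γ, w) / Σ_g P(Q, g | γ, w).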

Page 10: 009_20150201_Structural Inference for Uncertain Networks

Methods


• Fitting to empirical data

Page 11: 009_20150201_Structural Inference for Uncertain Networks

Methods

• Equality condition of (11)

• EM algorithm, repeat:

- E-step: fix γ, w and find q(g) by (14)

- M-step: find γ, w by maximizing the right-hand side of (11)

q(g) can then be used to detect communities
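The EM loop can be sketched for a toy network by computing q(g) exactly, that is, by enumerating all k^n assignments. This is an illustration only, not the authors' implementation: the paper uses belief propagation for the E-step, while enumeration works only for very small n. The function name `em_bruteforce`, the starting values, and the pair weight w(1 − p)Q + (1 − w)p(1 − Q) (the likelihood of one pair up to a constant, with p the edge density) are assumptions of this sketch.

```python
import itertools
import numpy as np

def em_bruteforce(Q, k=2, iters=30):
    """EM with an exact E-step; enumerates k**n assignments (tiny n only)."""
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    iu, ju = np.triu_indices(n, k=1)
    p = Q[iu, ju].sum() / len(iu)          # density m / C(n,2), m ~ sum_{i<j} Q_ij
    gamma = np.full(k, 1.0 / k)
    # assumed assortative, slightly asymmetric starting point
    w = np.full((k, k), 0.2) + np.diag(np.linspace(0.3, 0.4, k))
    states = np.array(list(itertools.product(range(k), repeat=n)))
    onehot = (states[:, :, None] == np.arange(k)).astype(float)  # (S, n, k)
    off_diag = ~np.eye(n, dtype=bool)
    Qe = Q[:, :, None, None]
    for _ in range(iters):
        # f[i, j, r, s]: weight of pair (i, j) when g_i = r and g_j = s
        f = w * (1 - p) * Qe + (1 - w) * p * (1 - Qe)
        # E-step: posterior q(g) over all assignments, then its marginals
        logq = np.log(gamma)[states].sum(axis=1)
        logq += np.log(f[iu, ju, states[:, iu], states[:, ju]]).sum(axis=1)
        q = np.exp(logq - logq.max())
        q /= q.sum()
        q1 = np.einsum('s,sir->ir', q, onehot)                # P(g_i = r)
        q2 = np.einsum('s,sir,sjt->ijrt', q, onehot, onehot)  # P(g_i = r, g_j = t)
        # M-step: t is the posterior edge probability given the groups
        t = w * (1 - p) * Qe / f
        gamma = np.clip(q1.mean(axis=0), 1e-9, None)
        gamma /= gamma.sum()
        num = (q2 * t)[off_diag].sum(axis=0)
        den = q2[off_diag].sum(axis=0)
        w = np.clip(num / np.maximum(den, 1e-12), 1e-6, 1 - 1e-6)
    return gamma, w, q1, q2

# Toy data: two planted groups of three nodes (made-up probabilities)
blocks = np.array([0, 0, 0, 1, 1, 1])
Q = np.where(blocks[:, None] == blocks[None, :], 0.9, 0.1)
np.fill_diagonal(Q, 0.0)
gamma, w, q1, q2 = em_bruteforce(Q)
same_group = q2[:, :, np.arange(2), np.arange(2)].sum(axis=-1)  # P(g_i = g_j)
```

Because group labels can be swapped without changing the likelihood, the exact one-node marginals `q1` can sit near 0.5 even when the structure is clear, so this sketch reads communities from the co-membership probabilities `same_group` instead.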

Page 12: 009_20150201_Structural Inference for Uncertain Networks

Methods

• M-step: Maximize the right-hand side of (11)

Apply the EM algorithm again to find the optimal w

Page 13: 009_20150201_Structural Inference for Uncertain Networks

Methods


• M-step: Update equations of parameters
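One consistent form of the update equations, derived here by maximizing the right-hand side of (11) and offered as a sketch rather than a transcription of the slide. Assumed notation: q^i_r is the probability under q(g) that node i is in group r, q^{rs}_{ij} is the probability that nodes i and j are in groups r and s, p is the edge density, and t^{rs}_{ij} is evaluated with the current value of w.

```latex
\begin{align*}
\gamma_r &= \frac{1}{n} \sum_i q^i_r, \\
w_{rs} &= \frac{\sum_{i \ne j} q^{rs}_{ij}\, t^{rs}_{ij}}{\sum_{i \ne j} q^{rs}_{ij}}, \\
t^{rs}_{ij} &= \frac{w_{rs}\,(1-p)\,Q_{ij}}{w_{rs}\,(1-p)\,Q_{ij} + (1 - w_{rs})\,p\,(1 - Q_{ij})}.
\end{align*}
```

The update for w is a weighted average of t over all pairs, with each pair weighted by how likely it is to fall in groups r and s.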

Page 14: 009_20150201_Structural Inference for Uncertain Networks

Methods

• Physical interpretation of t

The posterior probability that there is an edge between nodes i and j, given that they are in groups r and s.
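This interpretation can be checked numerically with Bayes' rule. The sketch below assumes the prior edge probability is w_rs and that the ratio of the edge and non-edge distributions of Q is ((1 − p)/p) · Q/(1 − Q), with p the overall edge density; the function name `edge_posterior_given_groups` is hypothetical.

```python
def edge_posterior_given_groups(Q_ij, w_rs, p):
    """Posterior probability of an edge given Q_ij and the groups' w_rs."""
    num = w_rs * (1 - p) * Q_ij                  # prior * likelihood (edge)
    den = num + (1 - w_rs) * p * (1 - Q_ij)      # + prior * likelihood (no edge)
    return num / den

p = 0.2
t_neutral = edge_posterior_given_groups(0.3, w_rs=p, p=p)     # no group information
t_same = edge_posterior_given_groups(0.5, w_rs=0.80, p=p)     # densely connected groups
t_diff = edge_posterior_given_groups(0.5, w_rs=0.05, p=p)     # sparsely connected groups
```

When w_rs equals the overall density p, the groups carry no extra information and t reduces to Qij itself; a larger w_rs raises the posterior above Qij and a smaller one lowers it.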

Page 15: 009_20150201_Structural Inference for Uncertain Networks

Methods

• E-step: Compute q(g)

- It is impractical to compute the denominator of eq. (14)

- q(g) could be approximated by importance sampling or MCMC

- However, in this paper, they use the “Belief Propagation” method

- Message $\eta^{i \to j}_r$ = the current best estimate of the probability that node i belongs to community r if node j is removed from the network

Page 16: 009_20150201_Structural Inference for Uncertain Networks

Belief propagation equation

Our target q(g)

Two-node marginal probability

Solve by iterating until convergence
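The iteration can be sketched with the standard belief propagation update for a pairwise model on all node pairs. This is an assumption-laden illustration, not the paper's exact equations: the pair weight f = w(1 − p)Q + (1 − w)p(1 − Q), the damping, the random initialization, and the name `belief_propagation` are choices made here, and any sparse-network approximations the paper applies are omitted.

```python
import numpy as np

def belief_propagation(Q, gamma, w, p, iters=200, tol=1e-8, damping=0.5, seed=0):
    """Iterate messages eta[i, j, r] (from node i to node j about g_i = r)."""
    Q = np.asarray(Q, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    w = np.asarray(w, dtype=float)
    n, k = Q.shape[0], len(gamma)
    Qe = Q[:, :, None, None]
    # f[i, j, r, s]: weight of pair (i, j) when g_i = r and g_j = s
    f = w * (1 - p) * Qe + (1 - w) * p * (1 - Qe)
    diag = np.arange(n)

    def incoming(eta):
        # logh[l, i, r] = log sum_s eta[l, i, s] f[i, l, r, s]
        logh = np.log(np.einsum('lis,ilrs->lir', eta, f))
        logh[diag, diag, :] = 0.0          # a node sends nothing to itself
        return logh, logh.sum(axis=0)      # second item: sum over all l != i

    rng = np.random.default_rng(seed)
    eta = rng.random((n, n, k))
    eta /= eta.sum(axis=-1, keepdims=True)
    for _ in range(iters):
        logh, total = incoming(eta)
        # message i -> j uses every neighbour l except j
        log_new = np.log(gamma) + total[:, None, :] - logh.transpose(1, 0, 2)
        new = np.exp(log_new - log_new.max(axis=-1, keepdims=True))
        new /= new.sum(axis=-1, keepdims=True)
        new = damping * eta + (1 - damping) * new
        change = np.abs(new - eta).max()
        eta = new
        if change < tol:
            break
    # one-node marginals q1[i, r]
    _, total = incoming(eta)
    q1 = np.exp(np.log(gamma) + total - total.max(axis=-1, keepdims=True))
    q1 /= q1.sum(axis=-1, keepdims=True)
    # two-node marginals q2[i, j, r, s] (diagonal entries i == j are not meaningful)
    q2 = eta[:, :, :, None] * eta.transpose(1, 0, 2)[:, :, None, :] * f
    q2 /= q2.sum(axis=(-2, -1), keepdims=True)
    return q1, q2

# Toy data: two planted groups of three nodes (made-up probabilities)
blocks = np.array([0, 0, 0, 1, 1, 1])
Q = np.where(blocks[:, None] == blocks[None, :], 0.9, 0.1)
np.fill_diagonal(Q, 0.0)
iu, ju = np.triu_indices(6, k=1)
p = Q[iu, ju].mean()
q1, q2 = belief_propagation(Q, gamma=[0.5, 0.5], w=[[0.8, 0.1], [0.1, 0.7]], p=p)
```

Each sweep costs on the order of n²k² operations, compared with the k^n terms in the exact normalization of q(g).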

Page 17: 009_20150201_Structural Inference for Uncertain Networks

Degree corrected stochastic block model

- The stochastic block model gives poor performance for community detection in real-world problems (because the model assumes a Poisson degree distribution).

• Degree corrected stochastic block model

- Probability of placing an undirected edge between nodes i, j that fall into groups r, s is di dj wrs
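The degree-corrected edge probability can be sketched as below. This assumes d is a vector of per-node degree-like parameters and g holds the group label of each node; the function name `dc_edge_probabilities` and the example values are hypothetical.

```python
import numpy as np

def dc_edge_probabilities(d, g, w):
    """Matrix of edge probabilities d_i * d_j * w[g_i, g_j]."""
    d = np.asarray(d, dtype=float)
    g = np.asarray(g)
    w = np.asarray(w, dtype=float)
    P = np.outer(d, d) * w[np.ix_(g, g)]   # w[g_i, g_j] for every pair
    np.fill_diagonal(P, 0.0)               # no self-loops
    if P.max() > 1:
        raise ValueError("d_i * d_j * w_rs exceeds 1; rescale d or w")
    return P

d = np.array([3.0, 1.0, 2.0, 2.0])
g = np.array([0, 0, 1, 1])
w = np.array([[0.10, 0.01],
              [0.01, 0.10]])
P = dc_edge_probabilities(d, g, w)
```

Two nodes in the same pair of groups can now have different connection probabilities, which lets the model accommodate broad degree distributions.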

Page 18: 009_20150201_Structural Inference for Uncertain Networks

Result - synthetic network

To satisfy eq. (4)

The delta function makes the matrix Q of edge probabilities realistically sparse, in keeping with the structure of real-world data sets, with a fraction 1 − c of non-edges having exactly zero probability in the observed data, on average.
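One construction with these properties is sketched below; it is an assumption of this example, not necessarily the paper's exact choice. Edges draw Qij from Beta(a + 1, b − 1), and non-edges draw exactly zero with probability 1 − c and Beta(a, b) otherwise. Choosing c = p(b − 1) / ((1 − p)a), with p the edge density, makes the ratio of the two distributions equal ((1 − p)/p) · Q/(1 − Q) for Q > 0. The function name `synthetic_Q` and the parameters a, b are hypothetical.

```python
import numpy as np

def synthetic_Q(A, a=1.0, b=5.0, seed=0):
    """Draw a matrix Q of edge probabilities for a given true network A."""
    if b <= 1:
        raise ValueError("need b > 1 so that Beta(a + 1, b - 1) is defined")
    rng = np.random.default_rng(seed)
    A = np.asarray(A)
    n = A.shape[0]
    iu, ju = np.triu_indices(n, k=1)
    edges = A[iu, ju] == 1
    N, m = len(iu), int(edges.sum())
    p = m / N
    c = p * (b - 1) / ((1 - p) * a)        # fraction of non-edges with Q > 0
    if not 0 < c <= 1:
        raise ValueError("no valid c for this density; change a or b")
    q = np.zeros(N)
    q[edges] = rng.beta(a + 1, b - 1, size=m)            # edges
    nonzero = ~edges & (rng.random(N) < c)
    q[nonzero] = rng.beta(a, b, size=int(nonzero.sum()))  # noisy non-edges
    Q = np.zeros((n, n))
    Q[iu, ju] = q
    return Q + Q.T

# True network: a ring of 20 nodes (made-up example)
n = 20
A = np.zeros((n, n), dtype=int)
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
Q = synthetic_Q(A)
```

Pairs with Qij = 0 are always non-edges in this construction, consistent with reading Qij as the probability that the pair is connected.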

Page 19: 009_20150201_Structural Inference for Uncertain Networks

Result - synthetic network


Page 20: 009_20150201_Structural Inference for Uncertain Networks

Result - protein interaction network


Page 21: 009_20150201_Structural Inference for Uncertain Networks

Edge Recovery


• Given the matrix Q of edge probabilities, can we make an informed guess about the adjacency matrix A?

- Simple approach: predict the edges with the highest probability

- Better approach: if we know that the network has community structure, then given two pairs of nodes with similar values of Qij, the pair in the same community should be more likely to be connected by an edge than the pair in different communities

Computed in the EM step
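A sketch of the community-aware estimate: average the group-conditional edge posterior over the pair's possible group assignments. The assumptions here are that `q2_ij[r, s]` is the probability that nodes i and j are in groups r and s, that p is the overall edge density, and that the group-conditional posterior follows from Bayes' rule with prior w[r, s]; the function name `recovered_edge_probability` is hypothetical.

```python
import numpy as np

def recovered_edge_probability(Q_ij, q2_ij, w, p):
    """Posterior edge probability for one pair, averaged over its groups."""
    w = np.asarray(w, dtype=float)
    num = w * (1 - p) * Q_ij
    t = num / (num + (1 - w) * p * (1 - Q_ij))   # t[r, s]: posterior given groups
    return float((np.asarray(q2_ij, dtype=float) * t).sum())

w = np.array([[0.80, 0.05],
              [0.05, 0.80]])
p = 0.2
# Two pairs with the same observed probability Q_ij = 0.5
same = recovered_edge_probability(0.5, [[1, 0], [0, 0]], w, p)  # both in group 0
diff = recovered_edge_probability(0.5, [[0, 1], [0, 0]], w, p)  # groups 0 and 1
```

Both pairs have the same observed Qij, yet the pair inside one community ends up well above 0.5 and the pair across communities well below it, so ranking by this quantity differs from ranking by Qij alone.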

Page 22: 009_20150201_Structural Inference for Uncertain Networks

Edge Recovery


Page 23: 009_20150201_Structural Inference for Uncertain Networks

Conclusion

• Motivation

- Analyze networks represented by uncertain measurements of their edges

• Proposal

- Fit a generative network model to the data using a combination of an EM algorithm and belief propagation

• Applications

- Reconstruct the underlying structure of the network (community detection, edge recovery, …)