Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami...

35
Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi [email protected] 4 th September, 2012

Transcript of Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami...

Page 1: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

Diffusion-Aware Sampling and Estimation in Information Diffusion

Networks

Diffusion-Aware Sampling and Estimation in Information Diffusion

Networks

By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi

[email protected]

4th September, 2012

Page 2: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

2 DMLP2P Live Video Streaming DMLDML2

Outline Introduction Related Work

Complete Data Partial Data

Problem Formulation Proposed Framework

Sampling Design Estimation Approach

Experimental Evaluation Conclusion References

Diffusion-Aware Sampling and Estimation

Page 3: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

3 DMLP2P Live Video Streaming DMLDML3

Introduction

Information Diffusion

One of the important topics in large On-line Social Networks (OSN) such as Facebook, Twitter, and YouTube.

Has a latent structure which makes its analysis considerably difficult

Although we usually discover the time of obtaining some information by people, we can not find the source of information easily.

Using partially-observed network data

Diffusion-Aware Sampling and Estimation

Page 4: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

4 DMLP2P Live Video Streaming DMLDML4

Introduction

Measurement of the network characteristicsSampling

-Collecting the data by a sampling method

- Considering the visiting probabilities of network

elements

Estimation- Using an estimator to

obtain the network characteristics

Diffusion-Aware Sampling and Estimation

Page 5: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

5 DMLP2P Live Video Streaming DMLDML5

Different Sampling Designs in Diffusion Networks

Considering the sampling approaches to study diffusion behaviors of social networks, apart from their topologies

Figure 1-(a): Using well-known sampling methods such as BFS and RW without considering the behavior of the diffusion process gathering redundant data + losing parts of diffusion data + decrease the performance of these sampling methods

Figure 1-(b): Working with diffusion network directly It is often not feasible to work with this latent network

Diffusion-Aware Sampling and Estimation

Page 6: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

6 DMLP2P Live Video Streaming DMLDML6

Introduction

A link-tracing based sampling method which utilizes

diffusion process properties to traverse the network more

accurately.Only uses the infection times as local information without

any knowledge about the latent structure of the

diffusion networkExtend the well-known

Hansen-Hurwitz estimator to correct the bias of sampled

data by three efficient estimators of characteristics:

(1) link-based (2) node-based (3) cascade-based

The First attempt to introduce a complete measurement

framework for the diffusion networks

Proposing a Novel Two-Step Framework

(Sampling/ Estimation)»DNS«

Diffusion-Aware Sampling and Estimation

Page 7: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

Related WorkRelated Work

Complete Data

Partial Data

Page 8: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

8 DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

8

Complete Data The most fundamental approach: Collecting the complete

diffusion data Following the Iraq war petitions in the format of e-mail [11],

[19] studying communication events between faculty and staff of a

university by e-mails [15] tracking the flow of information by extracting short textual

phrases [16]

missing data privacy policiesdiffusion network latent structure

Use sampling methods to obtain

partial data

Page 9: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

9 DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

9

Partial Data

Page 10: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

Problem FormulationProblem Formulation Basic Notations and Definitions Problem Definition

Page 11: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

11

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

11

Basic Notations & Definitions

G = (V, E) The underlying network (n=|V| & m=|E|)

Infection

some diffusible chunks such as information and epidemic diseases which propagate over G

Cascade

each path of infection will build a “cascade”

Diffusion Network

When the cascades spread over the underlying network, the diffusion network G* will be formed.

Gs = (Vs,Es)

the induced sub-graph of G by sampling rate of μ

Page 12: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

12

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

12

Diffusion Characteristics Measurement

Element Set T: A set of diffusion network elements such as nodes, links and cascades

L: a finite set of element labels

• Examples of a label: the degree of a node, the weight of a link, or the length of a cascade.

Target function f: T L⇾ : A label le is assigned to each element e T ∊ i.e. f = {(e, le)| e T, l∊ e L∊ }

• Infection as an example

To measure diffusion network characteristics, some aggregative function should be used.

• Average Function

Page 13: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

13

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

13

Problem Definition

Proposing a diffusion-aware measurement

framework

• samples from the underlying network and computes the desired target function f

• computes an estimate of Avg(f) by finding an appropriate estimator M.

Evaluation: Bias Metric

Final Goal • Finding a measurement framework which minimizes the bias ( )⍴

Page 14: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

Proposed FrameworkProposed Framework

DNS: Diffusion Network Sampling

1) Sampling Design2) Estimation Approach

Page 15: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

15

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

15

Sampling Design

Existing sampling methods such as BFS & RW do not consider diffusion paths.

Page 16: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

16

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

16

Sampling Design (cont’d)

Each cascade c can be assigned to a time vector tc = {t1, t2, …, tn} which shows the infection times

of nodes by cascade cCT = {t1, t2, …, tNc}: the set of cascades’ time vectors

Waiting time model[1] depends on = tΔ v – tu

Ce: set of cascades which pass over link e

The infection probability of link e

( )P e

Exponential Model

( )

| |cc C

ee

P eP

C

Page 17: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

17

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

17

The Algorithm

moving from a node u, to a neighboring node v, through an outgoing link with the infection probability

Using the infection times (i.e. CT) as local information without any prior knowledge about the latent structure of the diffusion network

Page 18: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

18

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

18

Estimation Approach

Correcting the selection bias of a sampling method by re-weighting of the measured values

Hansen-Hurwitz Estimator: elements are weighted inversely proportional to their visiting probability

Xi and п(Xi) are the visited element and its visiting probability on the ith draw of sampling method.

1

01

0

( )

( )ˆ

1( )

ki

i ik

i i

f x

x

x

Page 19: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

19

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

19

Estimation Approach

Characteristics

Link-based Node-basedCascade-

based

To use this estimator, the probability of visiting each element in the proposed sampling procedure should be

computed.

Page 20: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

20

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

20

Link-based Characteristics

Gaining some information without having any connection to others for propagation will be not valuable in a network.

Link Attendance: shows the amount of presence in diffusion process for a link

Moving over links with the probability of Pe

п(e) = Pe

Only Using local knowledge to compute the visiting probabilities of the links.

Page 21: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

21

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

21

Node-based Characteristics

Examples: The number of “Seeds” (the beginners of an

infection) [3] and “participation” (the fraction of

users involved in the information diffusion) [12]Modeling the diffusion process

as a Markov random walk over the underlying network G [18]

Where N (u) is the set of node u’ neighbors

Calculating п = {п(0), п(1),…, п(n-1)} needs the global

knowledge of a network (the stationary distribution of the

mentioned Markov chain)Not having the global view of

the network in sampling procedure

Finding the exact value of п is not possible in real systems.

Page 22: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

22

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

22

Cascade-based Characteristics

Cascades: the building blocks of a diffusion network

D

e

p

t

h

o

f a

s

p

rea

d

i

n

g

p

h

e

n

o

m

e

n

o

n

ca

n

b

e

d

eter

m

i

n

e

d

b

y

t

h

e

le

n

gt

h

o

f its

casca

d

es

A cascade visiting probability depends on visiting all of its links:

Calculating п(c) needs having all links of cascade c which is a global knowledge.

Page 23: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

23

Experimental EvaluationExperimental Evaluation

• Setup• Dataset• Performance Evaluation•Diffusion Behavior Analysis

Page 24: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

24

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

24

Setup

Characteristic: Link-Attendance

Baseline methods: BFS and RW

⍺: Control the speed of cascade transmission

β: control the distance through which a cascade can propagates

Diffusion rate (δ): The fraction of the underlying network G which is covered by the diffusion network G*

Page 25: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

25

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

25

Dataset

Synthetic NetworksReal Networks

Table 1: The network and cascade generation parameters

Page 26: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

26

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

26

Speed of Cascade

The unknown structure of the diffusion network structure unavailability of the cascade speed

Figure 2: Speed of cascade effect of link-attendance bias

0.4 ≤⍺≤ 0.7

Page 27: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

27

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

27

Performance Evaluation

Figure 3: Link Attendance characteristic evaluation in different sampling rates

Page 28: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

28

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

28

Performance Evaluation

Decreasing the bias by 30% in average by the DNS framework in comparison to the sampling design without estimation approach

Table 2: The average performance difference of DNS with BFS, RW & DNS-WoE

Page 29: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

29

DMLP2P Live Video Streaming DMLDMLDiffusion-Aware Sampling and Estimation

29

Diffusion Behavior Analysis

Figure 4: Analysis of diffusion rate over sampling frameworks

Page 30: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

30

DMLP2P Live Video Streaming DMLDMLDiffusion Network Extraction30

Contributions

Proposing a novel sampling design for gathering data from a diffusion network by utilizing the properties of diffusion process.

Pr

oposi

ng t

hree esti

mat

ors f

or c

orrecti

ng t

he

bias

of sa

mple

d

data: li

nk-

base

d,

node-

base

d, a

nd casca

de-

base

d c

haracteristics

Decreasi

ng t

he

bias

of

meas

uri

ng li

nk-

base

d c

haracteristics c

ompare

d t

o t

he

ot

her c

ommon sa

mpli

ng

met

hods

DNS Framewor

k

Page 31: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

31

DMLP2P Live Video Streaming DMLDMLDiffusion Network Extraction31

Conclusion & Future Work…

The proposed method outperforms BFS and RW in terms of link-attendance by about 37% and 35%, respectively

The proposed estimator can improve the performance of the sampling design by about 30%

DNS can act very well even in low sampling rates

DNS leads to low bias even in low diffusion rates.

Page 32: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

32

DMLP2P Live Video Streaming DMLDMLExtracting Network of Information32

References[1] M. Gomez-Rodriguez, J. Leskovec and A. Krause, Inferring networks of diffusion and influence, In proc. of KDD ’10, pages 1019-1028, 2010.

[2] Twitter Blog: = numbers. Blog.twitter.com., Retrieved 2012-01-20, http://blog.twitter.com/2011/03/numbers.html. ̸�

[3] M. Eslami, H.R. Rabiee and M.Salehi, Sampling from Information Diffusion Networks, 2012.

[4] M. Gjoka, M. Kurant, C. T. Butts and A. Markopoulou, Walking in Facebook: A Case Study of Unbiased Sampling of OSNs, Proceedings

of IEEE INFOCOM, 2010.

[5] M. Gjoka, M. Kurant, C. T. Butts and A. Markopoulou, Practical Recommendations on Crawling Online Social Networks, IEEE J. Sel.

Areas Commun, 2011.

[6] M. Salehi, H. R. Rabiee, N. Nabavi and Sh. Pooya, Characterizing Twitter with Respondent-Driven Sampling, International Workshop on Cloud and

Social Networking (CSN2011) in conjunction with SCA2011, No. 9, Vol. 29, pages 5521–5529, 2011.

[7] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel and B. Bhattachar-jee, Measurement and analysis of online social networks, Proceedings of

the ACM SIGCOMM conference on Internet measurement, pages 29–42, 2007.

[8] J. Leskovec and Ch. Faloutsos, Sampling from large graphs, Proceedings of the ACM SIGKDD conference on Knowledge discovery and data

mining, pages 631–636, 2006.

[9] M. Salehi, H. R. Rabiee, and A. Rajabi, Sampling from Complex Networks with high Community Structures, Chaos: An Interdisciplinary Journal of

Nonlinear Science , 2012.

[10] J. Leskovec, M. McGlohon, C. Faloutsos, N. S. Glance and M. Hurst, Patterns of Cascading Behavior in Large Blog Graphs, In proc. of

SDM’07, 2007.

[11] D. Liben-Nowell and J. Kleinberg, Tracing information flow on a global scale using Internet chain-letter data, Proc. of the National Academy of

Sciences, 105(12):4633-4638, 25 Mar, 2008.

[12] M. D. Choudhury, Y. Lin, H. Sundaram, K. S. Candan, L. Xie and A.Kelliher, How Does the Data Sampling Strategy Impact the Discovery of

Information Diffusion in Social Media?, Proc. of ICWSM , 2010.

Page 33: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

33

DMLP2P Live Video Streaming DMLDMLExtracting Network of Information33

References(cont’d)[13] E. Sadikov, M. Medina, J. Leskovec and H. Garcia-Molina, Correcting for missing data in information cascades, WSDM, pages 55-64, 2011.

[14] D. Gruhl, R. Guha, D. Liben-Nowell and A. Tomkins, Information dif-fusion through blogspace, In proc. of of the 13th international conference

on World Wide Web, pages 491–501, 2004.

[15] G. Kossinets, J. M. Kleinberg and D.J. Watts, The structure of information pathways in a social communication network, KDD ’08,

pages 435-443. 2008.

[16] J. Leskovec, L. Backstrom and J. Kleinberg, Meme-tracking and the dynamics of the news cycle, KDD ’09: Proc. of the 15th ACM SIGKDD

international conference on Knowledge discovery and data mining, pages 497-506, 2009.

[17] M.Gomez-Rodriguez, D. Balduzzi and B.Scholkopf, Uncovering the Temporal Dynamics of Diffusion Networks, Proc. of the 28 th International

Conference on Machine Learning, Bellevue,WA, USA, 2011.

[18] M. Eslami, H. R. Rabiee and M. Salehi, DNE: A Method for Extracting Cascaded Diffusion Networks from Social Networks, IEEE Social Computing Proceedings, 2011.

[19] F. Chierichetti, J. Kleinberg and D. Liben-Nowell, Reconstructing Pat-terns of Information Diffusion from Incomplete Observations, NIPS 2011.

[20] C.X. Lin, Q. Mei, Y.Jiang, J. Han and S. Qi, Inferring the Diffusion and Evolution of Topics in Social Communities, SNA KDD, 2011.

[21] Ch. Wilson, B. Boe, A. Sala, K. P. N Puttaswamy and B. Y. Zhao, User interactions in social networks and their implications, EuroSys ’09:

Proceedings of the 4th ACM European conference on Computer systems, pages 205–218, 2009.

[22] M. Kurant, A. Markopoulou and P. Thiran, On the bias of BFS (Breadth First Search), 22nd IEEE International Teletraffic Congress (ITC), pages

1–8, 2010.

[23] L. Lovas, Random walks on graphs: a survey, Combinatorics, 1993.

[24] M.R Henzinger, A. Heydon,M. Mitzenmacher and M. Najork, On near-uniform URL sampling, Proceedings of the World Wide Web conference

on Computer networks, pages 295-308, 2000.

[25] J. Leskovec and C. Faloutsos, Scalable modeling of real graphs using Kronecker multiplication, Proc. of ICML, pages 497-504, 2007.

Page 34: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

34

DMLP2P Live Video Streaming DMLDMLExtracting Network of Information34

References(cont’d)[26] J. Leskovec, J. Kleinberg and C. Faloutsos, Graphs over Time: Densi-cation Laws, Shrinking Diameters and Possible Explanations, Proc. Of KDD ,

2005.

[27] P.Erds and A. Rnyi, On the evolution of random graphs, Publ. Math. Inst. Hung. Acad. Sci., 5: page 17, 1960.

[28] A. Clauset, C. Moore and M. E. J. Newman, Hierarchical structure and the prediction of missing links in networks, Nature, 453: pages 98-101, 2008.

[29] J. Leskovec, K.J. Lang, A. Dasgupta and M.W. Mahoney, Statistical properties of community structure in large social and information networks, WWW, pages 695-704, 2008.

[30] L.A. Adamic and N. Glance, The political blogosphere and the 2004 US Election, Proc. of the WWW-2005 Workshop on the Weblogging Ecosystem, 2005.

[31] M. E. J. Newman, Finding community structure in networks using the eigenvectors of matrices, Preprint physics/0605087, 2006.

[32] S.A. Myers and J. Leskovec, On the Convexity of Latent Social Network Inference, Advances in Neural Infromation Processing Systems, 2010.

[33] Ch. Gkantsidis, M. Mihail and A. Saberi, Random walks in peer-to-peer networks: algorithms and evaluation, Elsevier Science Publishers B. V., Performance Evaluation, P2P Computing Systems, Vol 63, pages 241–263, 2006.

[34] D. Stutzbach, R. Rejaie, N. Duffield,S. Sen and W. Willinger, On Unbiased Sampling for Unstructured Peer-to-Peer Networks, Proceedings of IMC,pages 27–40, 2008.

[35] L. Becchetti, C. Castillo, D. Donato and A.Fazzone, On the bias of BFS (Breadth First Search), LinkKDD, pages 1–8, 2006.

[36] J. Yang and J. Leskovec, Modeling Information Diffusion in Implicit Networks, ICDM, IEEE Computer Society, pages 599-608, 2010.

[37] D. Kempe, J. Kleinberg and E. Tardos, Maximizing the spread of influence through a social network, KDD ’03: Proc. of the ninth ACM

SIGKDD international conference on Knowledge discovery and data mining, ACM Press, pages 137-146, 2003.

[38] M. Hansen and W. Hurwitz, On the Theory of Sampling from Finite Populations, Annals of Mathematical Statistics, No. 3, Vol 14, 1943.

[39] E. Volz and D. Heckathorn, Probability based estimation theory for respondent-driven sampling, Official Statistics, pages 79, Vol 24, 2008.

[40] M. Girvan and M. E. J. Newman, Community structure in social and biological networks, Proc. Natl. Acad. Sci. USA 99, pages 7821-7826,

2002.

Page 35: Diffusion-Aware Sampling and Estimation in Information Diffusion Networks By: Motahareh Eslami Mehdiabadi, Hamid Reza Rabiee and Mostafa Salehi eslami@ce.sharif.edu.

Q&AQ&A