Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum,...

31
Wei Chen 陈卫 Microsoft Research Asia Influence diffusion dynamics and influence maximization in complex social networks 1 CNCC'12 Social Computing Forum, Oct. 18, 2012

Transcript of Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum,...

Page 1: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Wei Chen

陈卫

Microsoft Research Asia

Influence diffusion dynamics and influence maximization in complex social networks

1 CNCC'12 Social Computing Forum, Oct. 18, 2012

Page 2: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Social influence (人际影响力)

CNCC'12 Social Computing Forum, Oct. 18, 2012 2

Social influence occurs when one's emotions, opinions, or behaviors are affected by others.

Social influence is when the actions or thoughts of individual(s) are changed by other individual(s).

Page 3: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

CNCC'12 Social Computing Forum, Oct. 18, 2012 3

Page 4: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

CNCC'12 Social Computing Forum, Oct. 18, 2012 4

Page 5: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

CNCC'12 Social Computing Forum, Oct. 18, 2012 5

Page 6: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

CNCC'12 Social Computing Forum, Oct. 18, 2012 6

Page 7: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

CNCC'12 Social Computing Forum, Oct. 18, 2012 7

[Christakis and Fowler, NEJM’07,08]

Page 8: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Booming of online social networks

8 CNCC'12 Social Computing Forum, Oct. 18, 2012

Page 9: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Opportunities on online social influence research and applications

CNCC'12 Social Computing Forum, Oct. 18, 2012 9

massive data set, real time, dynamic, open

help social scientists to understand social interactions, influence, and their diffusion in grand scale

help identifying influencers

help health care, business, political, and economic decision making

Page 10: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Examples of recent studies

CNCC'12 Social Computing Forum, Oct. 18, 2012 10

Influential and susceptible members of social networks [Aral & Walker, Science’2012]

installing a facebook app and automatic notification to friends

men are 49% more influential than women, women influence men 46% more than they influence other women

younger users are more susceptible than older users

influential people are likely to be clustered

Voting mobilization [Bond et al, Nature’2012]

show a facebook msg. on voting day with faces of friends who voted

generate 340K additional votes due to this message, among 60M people tested

Page 11: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Challenges on the research of online social networks

CNCC'12 Social Computing Forum, Oct. 18, 2012 11

Data mining and modeling: mining large-scale social network data, and building realistic models of social influence diffusion patterns, both macro level and micro level

Algorithm: Scalable algorithm design on influence computation, influencer ranking, and influence maximization

System: Graph-based data storage and processing, for both offline and online data analysis

Page 12: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Outline of this talk

CNCC'12 Social Computing Forum, Oct. 18, 2012 12

Scalable influence maximization

Competitive influence dynamics and influence blocking maximization

Page 13: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

[KDD’09, KDD’10, ICDM’10, AAAI’12, ICDM’12] Collaborators:

Yajun Wang, Siyu Yang, Chi Wang, Yifei Yuan, Li Zhang, Wei Lu, Ning Zhang, Kyomin Jung,

Wooram Heo

CNCC'12 Social Computing Forum, Oct. 18, 2012 13

Scalable Influence Maximization in Social Networks

Page 14: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Word-of-mouth (WoM) effect in social networks

Word-of-mouth effect is believed to be a promising advertising strategy.

Increasing popularity of online social networks may enable large scale WoM marketing

xphone is good

xphone is good

xphone is good

xphone is good

xphone is good

xphone is good

xphone is good

CNCC'12 Social Computing Forum, Oct. 18, 2012 14

Page 15: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

The Problem of Influence Maximization

CNCC'12 Social Computing Forum, Oct. 18, 2012 15

Social influence graph vertices are individuals links are social relationships number p(u,v) on a directed

link from u to v is the probability that v is activated by u after u is activated

Independent cascade model initially some seed nodes are

activated At each step, each newly

activated node u activates its neighbor v with probability p(u,v)

Influence maximization: find k seeds that generate the

largest expected influence

0.3

0.1

Page 16: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Prior work

CNCC'12 Social Computing Forum, Oct. 18, 2012 16

Influence maximization as a discrete optimization problem proposed by Kempe, Kleinberg, and Tardos, 2003

Introduce Independent Cascade (IC) and Linear Threshold (LT) models

Finding optimal solution is provably hard (NP-hard)

Greedy approximation algorithm, 63% approximation of the optimal solution (based on submodularity) select k seeds in k iterations

in each iteration, select one seed that provides the largest marginal increase in influence spread

Several subsequent studies improved the running time

Serious drawback:

very slow, not scalable: > 3 hrs on a 30k node graph for 50 seeds

Page 17: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Our work

CNCC'12 Social Computing Forum, Oct. 18, 2012 17

Exact influence computation is #P hard, for both IC and LT models --- computation bottleneck [KDD’10, ICDM’10]

Design new heuristics

MIA for general IC model [KDD’10] 103 speedup --- from hours to seconds

influence spread close to that of the greedy algorithm of [KKT’03]

Degree discount heuristic for uniform IC model [KDD’09] 106 speedup --- from hours to milliseconds

LDAG for LT model [ICDM’10] 103 speedup --- from hours to seconds

IRIE for IC model [ICDM’12] further improvement with time and space savings

Extend to time-critical influence maximization [AAAI’12]

Page 18: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Features of Maximum Influence Arborescence (MIA) heuristic

CNCC'12 Social Computing Forum, Oct. 18, 2012 18

Based on greedy approach

Localize computation

Use local tree structure

easy to compute

linear batch update on marginal influence spread

0.3

0.1

𝑢

𝑣

Page 19: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Experiment results on MIA heuristic

19 CNCC'12 Social Computing Forum, Oct. 18, 2012

103 times speed up

close to Greedy, 49% better than Degree, 15% better than DegreeDiscount

Experiment setup: • 35k nodes from coauthorship graph in physics archive • influence probability to a node v = 1 / (# of neighbors of v) • running time is for selecting 50 seeds

Influence spread vs. seed set size running time

Page 20: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Summary

CNCC'12 Social Computing Forum, Oct. 18, 2012 20

Scalable influence maximization algorithms

MixedGreedy and DegreeDiscount [KDD’09]

PMIA for the IC model [KDD’10]

LDAG for the LT model [ICDM’10]

IRIE for the IC model [ICDM’12]: further savings on time and space

MIA-M for IC-M model [AAAI’12]: include time delay and maximization within a short deadline

PMIA/LDAG have become state-of-the-art benchmark algorithms for influence maximization

Page 21: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

[SDM’11, SDM’12, others under submission] Alex Collins, Rachel Cummings, Te Ke, Zhenming Liu,

David Rincon, Xiaorui Sun, Yajun Wang, Wei Wei, Yifei Yuan, Xinran He, Guojie Song, Qingye Jiang, Yanhua Li,

Zhi-Li Zhang

CNCC'12 Social Computing Forum, Oct. 18, 2012 21

Competitive Influence

Page 22: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Competitive influence diffusion Exogenous competition: rival products compete for

social influence in the social network CLT model and CLDAG algorithm for influence blocking

maximization [SDM’12]

Endogenous competition: bad opinions about a product due to product defect competes with positive opinions IC-N model and MIA-N algorithm [SDM’11]

Influence diffusion in networks with positive and negative relationships voter model in signed networks with exact inf. max.

algorithm

CNCC'12 Social Computing Forum, Oct. 18, 2012 22

Page 23: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Exogenous competition

23

Competitive linear threshold model positive and negative influence each follows LT

model

when competing on a node at the same step, negative influence wins with a fixed probability

Influence blocking maximization Given the negative activation status

find 𝑘 positive seeds

minimize the further negative influence, or maximize the expected number of “saved” or “blocked” nodes from negative influence --- negative influence reduction

application: rumor control

CNCC'12 Social Computing Forum, Oct. 18, 2012

Page 24: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Influence blocking maximization under CLT

CNCC'12 Social Computing Forum, Oct. 18, 2012 24

Negative influence reduction is submodular

Allows greedy approximation algorithm

Fast heuristic CLDAG: reduce influence computation on local DAGs

use dynamic programming for LDAG computations

Page 25: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Performance of the CLDAG

CNCC'12 Social Computing Forum, Oct. 18, 2012 25

• with Greedy algorithm • 1000 node sampled from a mobile

network dataset • 50 negative seeds with max degrees

• without Greedy algorithm • 15K node NetHEPT, collaboration

network in arxiv • 50 negative seeds with max degrees

Page 26: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Scalability—Real dataset

CNCC'12 Social Computing Forum, Oct. 18, 2012 26

Scalability Result for subgraph with greedy algorithm

Page 27: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Ongoing and future research directions

CNCC'12 Social Computing Forum, Oct. 18, 2012 27

Model validation and influence analysis from real data

Even faster heuristic algorithms

Fast approximation algorithms

Online and adaptive algorithms

Game theoretic settings for competitive diffusion

Incentives for information / influence diffusions

Page 28: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Grand challenge

CNCC'12 Social Computing Forum, Oct. 18, 2012 28

Understand the true viral diffusion scenarios, online and offline

Apply social influence research to explain, predict, and control viral phenomena

New focus of network science in the next decade

Page 29: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Acknowledgments

CUHK, Aug. 29, 2012 29

Siyu Yang PhD student

Princeton

Yajun Wang MSRA

Chi Wang PhD Student

UIUC

Yifei Yuan PhD student

UPenn

Li Zhang MSR-SVC

Rachel Cummings PhD student

Northwestern U.

Alex Collins Google

Te (Tony) Ke PhD student UC Berkeley

Xiaorui Sun PhD student

Columbia

Zhenming Liu post-doc Princeton

Page 30: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Acknowledgments

CUHK, Aug. 29, 2012 30

Wei Wei PhD student

CMU

Guojie Song Peking U.

Xinran He PhD student

USC

David Rincon Universitat Politècnica

de Catalunya

Wei Lu PhD Student

UBC

Ning Zhang Purdue U.

Qingye Jiang Master student

Columbia U.

Yanhua Li PhD student

U. of Minnisoda

Zhili Zhang U. of Minnisoda

Kyomin Jung KAIST

Page 31: Influence diffusion dynamics and influence maximization in ......11 CNCC'12 Social Computing Forum, Oct. 18, 2012 Data mining and modeling: mining large-scale social network data,

Questions?

Additional materials on my homepage:

Search “Wei Chen Microsoft”

• KDD’12 tutorial on influence spread in social networks

• my papers

CNCC'12 Social Computing Forum, Oct. 18, 2012 31