Scalable and Parallelizable Processing of Influence Maximization for Large-Scale Social Networks


Apr 9, 2013

Jinha Kim, Seung-Keol Kim, Hwanjo Yu

Pohang University of Science and Technology (POSTECH)

2

Goal

• Boosting Influence Maximization processing by efficient influence evaluation

4

Viral Marketing

Influence Maximization Problem

Graph

Diffusion Model

Processing Algorithm

5

Word of Mouth Effect

...

...

...

7

A Marketer’s Perspective

...

...

...

PERSUADE ONE!

Making Money!!!!

9

How to find in an algorithmic way?

11

Quantifying Influence

The expected number of users influenced by S

12

Influence Maximization Problem (KKT 03)
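The formula on this slide did not survive extraction. As the problem is stated in KKT 03, it can be written as below. The symbols are assumptions chosen to match the rest of the deck: V is the node set of the social graph, k is the number of seeds the marketer may pick, and σ(S) is the expected number of users influenced by S.

```latex
S^{*} = \operatorname*{arg\,max}_{S \subseteq V,\ |S| = k} \sigma(S)
```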

14

Abstracting Social Networks

15

Abstracting Social Networks

[Figure: nodes u and v joined by an edge e]

17

Quantifying Influence

The expected number of entities influenced by S

DEPENDS ON

how influence is propagated through a graph

19

SEEDS

Independent Cascade (IC) model

active

inactive

t = 0

20

Independent Cascade (IC) model

active at t = i

inactive

t = i + 1

active at t < i

21

Independent Cascade (IC) model

inactive

active at t < j

t = j + 1

Propagation ends!!!
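The three IC slides above can be read as one loop: seeds are active at t = 0, each newly activated node gets a single chance to activate each inactive neighbor, and propagation ends once a step activates nobody. A minimal sketch of one cascade follows. The adjacency format (node mapped to a list of neighbor and probability pairs) and the function name `simulate_ic` are assumptions made for illustration, not taken from the slides.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one Independent Cascade and return the set of active nodes.

    graph maps node -> list of (neighbor, propagation probability).
    """
    active = set(seeds)        # nodes active at t = 0
    frontier = list(seeds)     # nodes activated in the previous step
    while frontier:            # propagation ends when nobody is newly activated
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # u gets exactly one chance to activate each inactive neighbor
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

# Toy graph (hypothetical): probabilities 1.0 and 0.0 make the outcome deterministic.
graph = {"a": [("b", 1.0), ("c", 0.0)], "b": [("d", 1.0)]}
result = simulate_ic(graph, {"a"}, random.Random(0))
```

With the toy probabilities above, the cascade always reaches b and d and never reaches c.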

24

Processing Algorithm

Macro Level Processing

Micro Level Processing

26

Macro Level (KKT 03)

• Finding the maximum among all C(|V|, k) candidate seed sets of size k

• NP-hard, shown by reduction from the set cover problem

27

Greedy Algorithm (KKT 03)

• Repeatedly selects the node v that gives the largest marginal gain, σ(S ∪ {v}) − σ(S), for the current seed set S

• σ({v}) and σ(S ∪ {v}) − σ(S) are the two major evaluation components
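The greedy selection described on this slide can be sketched as below. The influence function is passed in as a parameter `sigma`, so any evaluation method (Monte-Carlo simulation, path-based approximation) can be plugged in. The toy coverage function and its data are hypothetical and serve only to make the example deterministic.

```python
def greedy_seeds(nodes, k, sigma):
    """Pick k seeds, each time adding the node with the largest marginal gain."""
    seeds = set()
    for _ in range(k):
        base = sigma(seeds)
        best_node, best_gain = None, float("-inf")
        for v in nodes:
            if v in seeds:
                continue
            gain = sigma(seeds | {v}) - base   # marginal gain of adding v
            if gain > best_gain:
                best_node, best_gain = v, gain
        if best_node is None:                  # fewer than k nodes available
            break
        seeds.add(best_node)
    return seeds

# Hypothetical influence function: sigma(S) = number of nodes reached from S.
reach = {"a": {"a", "b", "c"}, "b": {"b", "c"}, "c": {"c"}, "d": {"d", "e"}}

def toy_sigma(S):
    return len(set().union(*(reach[v] for v in S))) if S else 0

chosen = greedy_seeds(sorted(reach), 2, toy_sigma)
```

The first round picks a (gain 3). In the second round b and c add nothing new, so d (gain 2) is chosen. Each round calls `sigma` once per remaining node, which is why the cost of evaluating σ dominates the running time.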

29

Micro Level (CWW 10)

• Cannot count influence propagation routes between two nodes

30

Evaluating σ(S)

• Monte-Carlo Simulation (KKT 03)

• Simultaneous simulation (CWY 09)

• Breaking down a graph into communities (WCS 10)

• Shortest path between two nodes (KS 06)

• Local arborescence based on the most probable path (CWW 10)
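Of the methods listed above, Monte-Carlo simulation is the most direct: run the IC process many times and average the number of active nodes. The sketch below assumes a graph stored as node mapped to a list of neighbor and probability pairs; the function name and the `runs` and `seed` parameters are illustrative choices, not from the slides.

```python
import random

def estimate_sigma(graph, seeds, runs=1000, seed=0):
    """Estimate sigma(S) as the average cascade size over `runs` IC simulations."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            # Processing order does not matter: every node is expanded once,
            # so every edge is tried at most once per simulation.
            u = frontier.pop()
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    frontier.append(v)
        total += len(active)
    return total / runs

# Toy graph (hypothetical): the exact value is 1 + 0.5 = 1.5.
graph = {"a": [("b", 0.5)]}
estimate = estimate_sigma(graph, {"a"}, runs=20000)
```

The estimate converges slowly (the error shrinks with the square root of the number of runs), which is the cost the other listed methods try to avoid.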

31


Macro Level Processing

IPA

33

Intuition

• How about extremely localizing influence?

• Use the influence path between two nodes as the unit of influence evaluation!

• Considering all paths is intractable (#P-hard)

• Consider only meaningful influence paths

36

Meaningful Influence Path in IC model

[Figure: a path v1, v2, v3, v4, v5 with propagation probability 0.1 on each of its four edges]
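Under the IC model, influence travels along a whole path only if every edge on it succeeds, so the path's probability is the product of its edge probabilities. The sketch below works through the five-node path on this slide. The threshold `theta` and the rule "meaningful means probability at least theta" are assumptions, suggested by the later "Finding Threshold" slide rather than stated here.

```python
import math

def path_probability(edge_probs):
    """Probability that influence travels along every edge of a path."""
    return math.prod(edge_probs)

theta = 0.001                                   # hypothetical pruning threshold
ipp = path_probability([0.1, 0.1, 0.1, 0.1])    # four edges of probability 0.1
is_meaningful = ipp >= theta                    # 0.1 ** 4 is far below theta
```

With four edges of probability 0.1 the path probability is about 0.0001, so a path this long contributes almost nothing and can be ignored under any threshold of that order.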

37

Traversing Graph

[Figure: a graph and a traversing tree from node a]

38

Extracting Paths

[Figure: a traversing tree from node a and the path collection extracted from it]

39

Organizing Paths

A path collection from a
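The three steps above (traverse the graph from a node, extract paths, organize them) can be sketched as a depth-first search that stops extending a path once its probability falls below a threshold, and files each kept path under its ending node. This is an illustrative sketch only: the function name, the `theta` parameter, and the dictionary layout are assumptions, not the authors' data structures.

```python
def collect_paths(graph, source, theta):
    """Return {end_node: [(path, probability), ...]} for simple paths from
    source whose probability is at least theta."""
    by_end = {}

    def dfs(node, path, prob):
        for nxt, p in graph.get(node, []):
            new_prob = prob * p
            # Skip cycles, and prune: longer paths can only be less probable.
            if nxt in path or new_prob < theta:
                continue
            new_path = path + [nxt]
            by_end.setdefault(nxt, []).append((new_path, new_prob))
            dfs(nxt, new_path, new_prob)

    dfs(source, [source], 1.0)
    return by_end

# Toy graph (hypothetical): node -> list of (neighbor, propagation probability).
graph = {"a": [("b", 0.5), ("c", 0.5)], "b": [("c", 0.5)], "c": [("d", 0.1)]}
paths_from_a = collect_paths(graph, "a", theta=0.2)
```

In the toy graph, node c is reached by two paths (through b with probability 0.25, and directly with probability 0.5), while every path to d falls below the threshold and is dropped.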

40

Approximating σ({v})

Influence of a node v

infl. of v to itself

Influence of a node v to u
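The formula these labels annotate was lost in extraction. One form consistent with the labels is sketched below. It is a reconstruction under assumptions, not the slide's verbatim formula: P_{v→u} denotes the collected meaningful paths from v to u, ipp(p) denotes the probability of path p, and the paths are treated as if they succeed independently.

```latex
\hat{\sigma}(\{v\})
  = \underbrace{1}_{\text{infl. of } v \text{ to itself}}
  + \sum_{u \in V \setminus \{v\}}
    \underbrace{\Bigl( 1 - \prod_{p \in P_{v \to u}} \bigl( 1 - \mathrm{ipp}(p) \bigr) \Bigr)}_{\text{influence of } v \text{ to } u}
```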

41

Parallel evaluation

• To approximate σ({v}), P_{v→V} is required

• For v ≠ u, P_{v→V} and P_{u→V} do not have common paths

• Independent evaluation of σ({v}) is guaranteed

[Figure: v1 with nodes u11 ... u1n, and v2 with nodes u21 ... u2n]
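Because each node's path collection is disjoint from every other node's, the per-node evaluations share no state and can be handed to separate workers. The sketch below illustrates this with Python's `concurrent.futures`. The data layout, the function name, and the use of a thread pool are assumptions for illustration; the slides do not say which parallel runtime was used.

```python
from concurrent.futures import ThreadPoolExecutor

def node_influence(paths_by_end):
    """Approximate sigma({v}) from v's own path collection only."""
    total = 1.0                          # influence of v on itself
    for path_probs in paths_by_end.values():
        miss = 1.0
        for prob in path_probs:
            miss *= 1.0 - prob           # chance that no path to this node succeeds
        total += 1.0 - miss
    return total

# Hypothetical path collections: start node -> {end node: [path probabilities]}.
collections = {
    "v1": {"u11": [0.5], "u12": [0.5, 0.5]},
    "v2": {"u21": [1.0]},
}

# Each task reads only its own collection, so no locking or ordering is needed.
with ThreadPoolExecutor() as pool:
    scores = dict(zip(collections, pool.map(node_influence, collections.values())))
```

In CPython a thread pool shows the independence of the tasks but does not speed up CPU-bound work; a process pool or native threads would be the practical choice for real graphs.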

42

Re-organizing

• Changing perspective from starting nodes to ending nodes
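Changing perspective from starting nodes to ending nodes amounts to re-indexing the same paths by their last node. A minimal sketch, assuming paths are stored as lists of node names grouped by starting node (a hypothetical layout):

```python
from collections import defaultdict

def regroup_by_end(paths_by_start):
    """Turn {start: [path, ...]} into {end: [path, ...]}."""
    by_end = defaultdict(list)
    for paths in paths_by_start.values():
        for path in paths:
            by_end[path[-1]].append(path)   # the last node is the path's end
    return dict(by_end)

by_start = {"a": [["a", "b"], ["a", "b", "c"]], "b": [["b", "c"]]}
by_end = regroup_by_end(by_start)
```

After regrouping, all paths that can influence a given node sit together, which is the view needed when asking how strongly the current seeds already reach that node.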

43

Approximating σ(S ∪ {v}) − σ(S) is not trivial

• σ({v}) ≠ σ(S ∪ {v}) − σ(S)

• Influence blocking!

• v blocks a path from u ∈ S

• We should detect blocked (invalid) paths

[Figure: nodes u and v, before and after]

44

Detecting influence blocking

• Current seed set : S

• New seed node : v

• Valid Paths

[Figure: node pairs (u, v) and (v, u)]
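One way to picture the validity check is as a filter over stored paths: once v joins the seed set, a path that starts at a seed and runs through another seed no longer carries new influence. The rule used below (a path is invalid if any node after its start is a seed) is an assumption made for illustration; the slides do not spell out the exact test.

```python
def valid_paths(paths, seeds):
    """Keep paths that do not run through a seed node after their starting node."""
    return [p for p in paths if not any(node in seeds for node in p[1:])]

seeds = {"u", "v"}     # current seed set S = {u} plus the new seed v
paths_from_u = [["u", "a"], ["u", "v", "b"], ["u", "a", "v"]]
kept = valid_paths(paths_from_u, seeds)
```

Here the path through v to b is blocked by v, and the path ending at v is dropped because v is already active as a seed, so only the path from u to a remains.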

45

Adding a seed node

46

Detect invalid paths

47

Approximating σ(S ∪ {v}) − σ(S)

Marginal infl. of a node v

infl. of v to itself

Infl. of seeds S to a node v

Only consider valid paths

51

Empirical Evaluation

52

Dataset

53

Algorithms

• Monte-Carlo[Greedy] (LKG 07)

• PMIA (CWW 10)

• SD (single discount)

• Random (baseline)

• IPA

54

Finding Threshold

55

Processing Time

57

Influence

58

Influence

59

Influence

60

Parallelization Effect

61

Q & A

62

References

63

• KKT 03 : Kempe, D., Kleinberg, J., and Tardos, E. Maximizing the spread of influence through a social network. (KDD ’03)

• KS 06 : Kimura, M., and Saito, K. Tractable models for information diffusion in social networks. (PKDD ’06)

• LKG 07 : Leskovec, J., Krause, A., Guestrin, C., Faloutsos, C., VanBriesen, J., and Glance, N. Cost-effective outbreak detection in networks. (KDD ’07)

• CWY 09 : Chen, W., Wang, Y., and Yang, S. Efficient influence maximization in social networks. (KDD ’09)

64

• CWW 10 : Chen, W., Wang, C., and Wang, Y. Scalable influence maximization for prevalent viral marketing in large-scale social networks. (KDD ’10)

• WCS 10 : Wang, Y., Cong, G., Song, G., and Xie, K. Community-based greedy algorithm for mining top-k influential nodes in mobile social networks. (KDD ’10)

• JSC 11 : Jiang, Q., Song, G., and Cong, G. Simulated Annealing Based Influence Maximization in Social Networks. (AAAI ’11)

• LYK 12 : Lee, W., Kim, J., and Yu, H. CT-IC: Continuously activated and Time-restricted Independent Cascade Model for Viral Marketing. (ICDM ’12)
