Scalable and Parallelizable Processing of Influence Maximization for Large-Scale Social Networks
Apr 9, 2013
Jinha Kim, Seung-Keol Kim, Hwanjo Yu
Pohang University of Science and Technology (POSTECH)
Goal
• Boosting Influence Maximization processing by efficient influence evaluation
Viral Marketing
Influence Maximization Problem
Graph
Diffusion Model
Processing Algorithm
Word of Mouth Effect
A Marketer’s Perspective
PERSUADE ONE!
Making Money!!!!
How to find whom to persuade in an algorithmic way?
Quantifying Influence
The expected number of users influenced by S
Influence Maximization Problem (KKT 03)
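In the formulation of KKT 03, the problem can be written as below. The symbols V (the set of users) and k (the number of seeds to pick) are assumed here, since the transcript does not name them; σ(S) is the expected number of users influenced by S.

```latex
S^{*} = \operatorname*{arg\,max}_{S \subseteq V,\ |S| = k} \sigma(S)
```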
Abstracting Social Networks
Abstracting Social Network
Figure: nodes u and v connected by an edge e
Quantifying Influence
The expected number of entities influenced by S
DEPENDS ON
how influence is propagated through a graph
Independent Cascade (IC) model
t = 0: the seeds are active, all other nodes are inactive
Independent Cascade (IC) model
t = i + 1
Legend: active at t = i, active at t < i, inactive
Independent Cascade (IC) model
t = j + 1
Legend: active at t < j, inactive
Propagation ends!!!
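The three IC slides can be read as one simulation loop. The sketch below follows the usual IC rule from KKT 03: a node activated at step t gets a single chance to activate each inactive neighbor at step t + 1. The graph representation (a dict of neighbor and probability pairs) and the function name are assumptions made for illustration, not taken from the slides.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one Independent Cascade and return the set of active nodes.

    graph maps each node to a list of (neighbor, probability) pairs.
    """
    active = set(seeds)
    frontier = list(seeds)          # nodes activated in the previous step
    while frontier:                 # propagation ends when no node is newly activated
        newly_active = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # u gets a single chance to activate each inactive neighbor v
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active

# Hypothetical toy graph: probability 1.0 edges always fire, 0.0 edges never do.
toy_graph = {"a": [("b", 1.0), ("c", 0.0)], "b": [("d", 1.0)]}
result = simulate_ic(toy_graph, {"a"}, random.Random(0))
print(sorted(result))  # ['a', 'b', 'd']
```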
Processing Algorithm
Macro Level Processing
Micro Level Processing
Macro Level (KKT 03)
• Finding the maximum among all candidate seed sets
• NP-hard (shown by reduction from the set-covering problem)
Greedy Algorithm (KKT 03)
• Repeatedly selects the node that gives the largest marginal gain
• and are two major evaluation components
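A minimal sketch of the greedy selection described on this slide. The influence function is passed in as `sigma`; the `coverage` function used in the example is a hypothetical stand-in that only counts covered nodes and is not an influence model from the talk.

```python
def greedy_seeds(nodes, k, sigma):
    """Pick k seeds, each time adding the node with the largest marginal gain."""
    seeds = set()
    for _ in range(k):
        base = sigma(seeds)
        best = max((v for v in nodes if v not in seeds),
                   key=lambda v: sigma(seeds | {v}) - base)
        seeds.add(best)
    return seeds

# Hypothetical stand-in for sigma: the number of nodes covered by the seeds.
reach = {"a": {"a", "b", "c"}, "b": {"b", "c"}, "c": {"c"}, "d": {"d", "e"}}

def coverage(seeds):
    covered = set()
    for s in seeds:
        covered |= reach[s]
    return len(covered)

chosen = greedy_seeds(sorted(reach), 2, coverage)
print(sorted(chosen))  # ['a', 'd']
```

Every round evaluates `sigma` once per remaining node, which is why the cost of a single influence evaluation dominates the running time.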
Micro Level (CWW 10)
• Cannot count influence propagation routes between two nodes
Evaluating σ(S)
• Monte-Carlo Simulation (KKT 03)
• Simultaneous simulation (CWY 09)
• Breaking down a graph into communities (WCS 10)
• Shortest path between two nodes (KS 06)
• Local arborescence based on the most probable path (CWW 10)
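Of the approaches listed, Monte-Carlo simulation is the simplest to sketch: run many random cascades and average their sizes. The code below is an illustration under the IC model; the function names, the default run count, and the toy graph are assumptions.

```python
import random

def cascade_size(graph, seeds, rng):
    """Size of one random Independent Cascade started from seeds."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        # Processing order does not change the result: each edge is tried at most once.
        u = frontier.pop()
        for v, p in graph.get(u, []):
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return len(active)

def estimate_sigma(graph, seeds, runs=10000, seed=0):
    """Average cascade size over many runs, as an estimate of sigma(seeds)."""
    rng = random.Random(seed)
    return sum(cascade_size(graph, seeds, rng) for _ in range(runs)) / runs

# Hypothetical two-node graph: the exact value is 1 + 0.5 = 1.5.
toy_graph = {"a": [("b", 0.5)]}
estimate = estimate_sigma(toy_graph, {"a"})
print(estimate)
```

The estimate only gets close to the exact value after many runs, and the greedy algorithm needs one such estimate per candidate node per round.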
Processing Algorithm
Macro Level Processing
IPA
Intuition
• How about extremely localizing influence?
• Influence path between two nodes as the influence evaluation unit!
• Considering all paths is not tractable (#P-hard)
• Only considering meaningful influence paths
Meaningful Influence Path in IC model
v1 → v2 → v3 → v4 → v5, each edge with probability 0.1
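Under the IC model, influence travels along a whole path only if every edge on it fires, so the path probability is the product of its edge probabilities. The sketch below computes it for the five-node path on this slide. Treating a path as meaningful only when its probability reaches a threshold is an assumption made here for illustration, and the value of `theta` is hypothetical.

```python
import math

def path_probability(edge_probs):
    """Probability that influence travels along every edge of a path."""
    return math.prod(edge_probs)

# The path v1 -> v2 -> v3 -> v4 -> v5 from the slide: four edges of 0.1 each.
prob = path_probability([0.1, 0.1, 0.1, 0.1])
theta = 0.001  # hypothetical threshold, not a value from the slides
print(prob < theta)  # True: this path would be treated as not meaningful
```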
Traversing Graph
Figure panels: Graph; A traversing tree from a
Extracting Paths
Figure panels: A traversing tree from a; A path collection from a
Organizing Paths
A path collection from a
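The traverse, extract, and organize steps can be sketched as a depth-first traversal that records every simple path from a start node. Pruning a branch once its probability falls below a threshold is an assumption here; it is safe because extending a path can only lower its probability. The function name, graph, and threshold value are hypothetical.

```python
def collect_paths(graph, source, theta):
    """Depth-first traversal from source, keeping simple paths with probability >= theta."""
    paths = []

    def visit(path, prob):
        for v, p in graph.get(path[-1], []):
            if v in path or prob * p < theta:   # skip cycles and pruned branches
                continue
            paths.append((path + [v], prob * p))
            visit(path + [v], prob * p)

    visit([source], 1.0)
    return paths

# Hypothetical graph; "a" matches the start node named on the slides.
toy_graph = {"a": [("b", 0.5), ("c", 0.5)], "b": [("c", 0.5)], "c": [("d", 0.1)]}
for path, prob in collect_paths(toy_graph, "a", theta=0.2):
    print(path, prob)
```

In this example both paths that would reach `d` fall below the threshold, so the collection holds three paths ending at `b` or `c`.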
Approximating σ({v})
Influence of a node v
infl. of v to itself
Influence of a node v to u
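The annotations on this slide suggest a sum of two parts: 1 for v itself, plus the influence of v on each other node u. The sketch below is one way to compute that from a path collection. Combining several paths that reach the same u as if they were independent is an assumption made here; the slide's exact formula did not survive in the transcript.

```python
from collections import defaultdict

def approx_sigma_single(paths):
    """Approximate sigma({v}) from the (path, probability) pairs that start at v.

    Assumption: paths reaching the same node u are combined as if independent.
    """
    miss = defaultdict(lambda: 1.0)     # probability that no path reaches u
    for path, prob in paths:
        miss[path[-1]] *= 1.0 - prob
    # 1 for v itself, plus the influence of v on every reachable node u
    return 1.0 + sum(1.0 - m for m in miss.values())

# Hypothetical path collection from a: two paths reach c, one reaches b.
paths_from_a = [(["a", "b"], 0.5), (["a", "b", "c"], 0.25), (["a", "c"], 0.5)]
value = approx_sigma_single(paths_from_a)
print(value)  # 2.125
```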
Parallel evaluation
• To approximate σ({v}), P_{v→V} is required
• For v ≠ u, P_{v→V} and P_{u→V} do not have common paths
• Independent evaluation of σ({v}) is guaranteed
Figure: v1 with paths to u11 ... u1n, and v2 with paths to u21 ... u2n
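Because the path collections of different start nodes share no paths, each node can be scored as a separate task. The sketch below maps such tasks onto a worker pool from Python's standard library. The scoring function and the data are hypothetical placeholders, and the talk does not say which parallel framework the authors used.

```python
from concurrent.futures import ThreadPoolExecutor

def influence_of(item):
    """Score one start node from its own path probabilities (hypothetical scoring)."""
    node, path_probs = item
    return node, 1.0 + sum(path_probs)

# Hypothetical per-node path probabilities; no path is shared between start nodes.
paths_by_start = {"v1": [0.5, 0.25], "v2": [0.5], "v3": []}

# Each start node is an independent task, so the tasks can be mapped onto workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    scores = dict(pool.map(influence_of, paths_by_start.items()))

print(scores)  # {'v1': 1.75, 'v2': 1.5, 'v3': 1.0}
```

Threads in CPython do not speed up CPU-bound work; `ProcessPoolExecutor` offers the same `map` interface when real parallel speedup is needed.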
Re-organizing
• Changing perspective from starting nodes to ending nodes
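One way to picture the change of perspective is an index that regroups the same paths by their ending node, so that all paths arriving at a node can be looked up together. The data layout and function name below are assumptions for illustration.

```python
from collections import defaultdict

def index_by_end_node(paths_by_start):
    """Regroup paths stored per starting node so they are keyed by ending node."""
    by_end = defaultdict(list)
    for paths in paths_by_start.values():
        for path, prob in paths:
            by_end[path[-1]].append((path, prob))
    return dict(by_end)

# Hypothetical path collections keyed by starting node.
paths_by_start = {
    "a": [(["a", "b"], 0.5), (["a", "b", "c"], 0.25)],
    "b": [(["b", "c"], 0.5)],
}
by_end = index_by_end_node(paths_by_start)
print(sorted(by_end))  # ['b', 'c']
```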
Approximating σ(S ∪ {v}) - σ(S) is not trivial
• σ({v}) ≠ σ(S ∪ {v}) - σ(S)
• influence blocking!!!!
• v blocks a path from u ∈ S
• We should detect blocked (invalid) paths
Figure: a path between u and v, before and after
Detecting influence blocking
• Current seed set : S
• New seed node : v
• Valid Paths
Figure: two path diagrams, one from u to v and one from v to u
Adding a seed node
Detect invalid paths
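A minimal sketch of the blocking check, assuming that a path from a seed becomes invalid as soon as any later node on it is also a seed. This rule is inferred from the bullet "v blocks a path from u ∈ S"; the function name and example paths are hypothetical.

```python
def valid_paths(paths, seeds):
    """Keep the paths that start at a seed and pass through no other seed.

    Assumption: a path is blocked (invalid) when any node after its start is a seed.
    """
    return [(path, prob) for path, prob in paths
            if path[0] in seeds and not any(n in seeds for n in path[1:])]

# Hypothetical paths from seed u; adding v as a seed blocks the path through v.
paths = [(["u", "v", "w"], 0.25), (["u", "x"], 0.5)]
before = valid_paths(paths, {"u"})
after = valid_paths(paths, {"u", "v"})
print(len(before), len(after))  # 2 1
```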
Approximating σ(S ∪ {v}) - σ(S)
Marginal infl. of a node v
infl. of v to itself
Infl. of seeds S to a node v
Only consider valid paths
Empirical Evaluation
Dataset
Algorithms
• Monte-Carlo[Greedy] (LKG 07)
• PMIA (CWW 10)
• SD (single discount)
• Random (baseline)
• IPA
Finding Threshold
Processing Time
Influence
Parallelization Effect
Q & A
References
• KKT 03 : Kempe, D., Kleinberg, J., and Tardos, E. Maximizing the spread of influence through a social network. (KDD ’03)
• KS 06 : Kimura, M., and Saito, K. Tractable models for information diffusion in social networks. (PKDD ’06)
• LKG 07 : Leskovec, J., Krause, A., Guestrin, C., Faloutsos, C., VanBriesen, J., and Glance, N. Cost-effective outbreak detection in networks. (KDD ’07)
• CWY 09 : Chen, W., Wang, Y., and Yang, S. Efficient influence maximization in social networks. (KDD ’09)
• CWW 10 : Chen, W., Wang, C., and Wang, Y. Scalable influence maximization for prevalent viral marketing in large-scale social networks. (KDD ’10)
• WCS 10 : Wang, Y., Cong, G., Song, G., and Xie, K. Community-based greedy algorithm for mining top-k influential nodes in mobile social networks. (KDD ’10)
• JSC 11 : Jiang, Q., Song, G., and Cong, G. Simulated annealing based influence maximization in social networks. (AAAI ’11)
• LYK 12 : Lee, W., Kim, J., and Yu, H. CT-IC: Continuously activated and time-restricted independent cascade model for viral marketing. (ICDM ’12)