A glimpse on social influence and link prediction in OSNs

Page 1: A glimpse on social influence and link prediction in OSNs

A glimpse on social influence and link prediction in OSNs

Keywords: link creation, link prediction, homophily, social influence, aNobii

Workshop on Data Driven Dynamical Networks

Speaker: Luca Maria Aiello, PhD student
Università degli Studi di Torino, Computer Science Department
[email protected]

Page 2: A glimpse on social influence and link prediction in OSNs

Acknowledgments

People:

• Università degli Studi di Torino: Giancarlo Ruffo, Rossano Schifanella
• ISI Foundation: Alain Barrat, Ciro Cattuto
• School of Informatics and Computing, Indiana University: Filippo Menczer

Page 3: A glimpse on social influence and link prediction in OSNs

Dynamics leading to link creation

Several theories from sociology:
◦ Self-interest
◦ Mutual-interest
◦ Exchange
◦ Contagion (influence)
◦ Balance
◦ Homophily
◦ Proximity

28/09/2010 Les Houches 2010 - Luca Maria Aiello, Università degli Studi di Torino

Food networks
Collaboration networks
Social media

2nd part: exploit the observations on these phenomena to predict future links

Page 4: A glimpse on social influence and link prediction in OSNs

Outline

• Dataset
• Topical overlap
• Homophily and influence
• Link prediction
• Conclusions

Page 6: A glimpse on social influence and link prediction in OSNs

Social network for bookworms

Data-driven analysis on anobii.com

Social network
◦ Directed
◦ Friendship + neighborhood

Profile features
◦ Library and wishlist
◦ Groups
◦ Tags

4th snapshot   Friendship   Neighborhood   Union
Nodes          74,908       54,590         86,800
Links          268,655      429,482        697,910

6 snapshots, 15 days apart
Full giant connected component

Page 7: A glimpse on social influence and link prediction in OSNs

Basic statistics

• Broad distributions
• Positive correlations between connectivity and activity
• Assortativity

[Figure: log-log plot of ng(kout), nb(kout) and nw(kout) versus out-degree kout; both axes span 10^0 to 10^3]

Page 8: A glimpse on social influence and link prediction in OSNs

Triadic closure

• Reciprocation is strong (exchange)
• Users tend to choose “friends of their friends” as new friends (balance)

Classification of new links at time t+1 between nodes already present at time t (t ∈ {1,…,5}):

Direct   Reciprocated   Bidirectional   Closure   Double closure
75%      20%            25%             30%       10%
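The classification of new links can be sketched in a few lines of Python. This is only an illustration under assumed definitions, since the transcript does not spell them out: a new link u → v is taken to be "reciprocated" if v → u already existed at time t, "bidirectional" if v → u is also new at t+1, and "direct" otherwise; "closure" means a directed two-step path u → w → v existed at time t. Double closure is left out because its definition is not recoverable here. The function name and the tuple-based edge format are hypothetical.

```python
def classify_new_links(edges_t, edges_t1):
    """Tag each link that is new at t+1 and joins nodes already present at t.

    Edges are (source, target) tuples. The category definitions are
    assumptions made for this sketch, not taken from the slides.
    """
    old = set(edges_t)
    nodes_t = {n for edge in old for n in edge}
    successors = {}
    for u, v in old:
        successors.setdefault(u, set()).add(v)

    new = {(u, v) for (u, v) in set(edges_t1) - old
           if u in nodes_t and v in nodes_t}

    labels = {}
    for u, v in new:
        tags = set()
        if (v, u) in old:
            tags.add("reciprocated")   # answers a link that existed at t
        elif (v, u) in new:
            tags.add("bidirectional")  # both directions appear together
        else:
            tags.add("direct")
        # closure: some w with u -> w and w -> v at time t
        if any(v in successors.get(w, ()) for w in successors.get(u, ())):
            tags.add("closure")
        labels[(u, v)] = tags
    return labels
```

Because closure is tagged independently of the reciprocity category, a link can carry two tags, which would be consistent with percentages that do not sum to 100%.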

Page 9: A glimpse on social influence and link prediction in OSNs

Outline

• Dataset
• Topical overlap
• Homophily and influence
• Link prediction
• Conclusions

Page 10: A glimpse on social influence and link prediction in OSNs

Profile similarity vs. social distance

• Topical overlap
• Statistical correlation because of assortative biases?
• Null model to discern real overlap from purely statistical effects
◦ No topical overlap other than that caused by statistical mixing patterns

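Profile similarity (cosine, here for books b): $\sigma_b(u,v) = \frac{\sum_b \delta_u^b \, \delta_v^b}{\sqrt{n_u \, n_v}}$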

Does similarity between user profiles depend on the social distance?
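As a rough illustration of how a profile similarity of this kind can be computed, the sketch below treats each user's library as a plain set of book identifiers and returns the cosine similarity of the two sets (shared items divided by the geometric mean of the set sizes). The function name and the set-based representation are assumptions made for the example; the talk may weight items differently.

```python
import math


def cosine_similarity(items_u, items_v):
    """Cosine similarity between two item sets (binary profile vectors)."""
    a, b = set(items_u), set(items_v)
    if not a or not b:
        return 0.0  # an empty profile has no overlap with anything
    return len(a & b) / math.sqrt(len(a) * len(b))
```

Averaging this value over all pairs at a given distance on the social graph gives one curve of similarity against social distance.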

Page 11: A glimpse on social influence and link prediction in OSNs

Geographical overlap

• Null model test with random link rewire
• Country-level overlap due to language barriers
• City-level overlap

22/08/2010 SocialCom 2010 - Luca Maria Aiello, Università degli Studi di Torino
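A null model based on random link rewiring could look like the following sketch. The exact rewiring procedure used in the talk is not given, so this example assumes a degree-preserving target swap on a directed edge list with no duplicate edges and no self-loops; `same_attribute_fraction` and `rewire` are hypothetical helper names. Comparing the fraction of links that join users from the same country before and after rewiring shows how much of the overlap survives when only the degree sequence is kept.

```python
import random


def same_attribute_fraction(edges, attribute):
    """Fraction of links whose endpoints share an attribute (e.g. country).

    Assumes a non-empty edge list.
    """
    edges = list(edges)
    same = sum(1 for u, v in edges if attribute[u] == attribute[v])
    return same / len(edges)


def rewire(edges, n_swaps, seed=0):
    """Randomize links while keeping every in-degree and out-degree.

    Each attempt picks two links a->b and c->d and replaces them with
    a->d and c->b, skipping swaps that would create a self-loop or a
    duplicate link.
    """
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i = rng.randrange(len(edges))
        j = rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges
```

If the observed same-country fraction is well above the value on rewired networks, the geographical overlap cannot be explained by the degree sequence alone.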

Page 12: A glimpse on social influence and link prediction in OSNs

Outline

• Dataset
• Topical overlap
• Homophily and influence
• Link prediction
• Conclusions

Page 13: A glimpse on social influence and link prediction in OSNs

Causality between similarity and link creation

• Topical overlap is observed for all profile features
• Three possible explanations:
  1. Homophily (people connect with similar people)
  2. Social influence (social connection conveys similarity)
  3. Mixture of the two
• Explore the causality relationship between profile similarity and social linking

What is the cause of topical overlap?

Page 14: A glimpse on social influence and link prediction in OSNs

Similarity → link creation (homophily)

              ⟨ncb⟩   σb     ⟨ncg⟩   σg
duv = 2       9.5     0.02   1.12    0.05
u → v         12.9    0.04   1.10    0.08
u ↔ v         18.5    0.04   1.67    0.11
Closure       18.2    0.04   1.81    0.10
Dbl closure   23.4    0.05   1.20    0.12

Average similarity of pairs forming new links between t and t+1 (t=4), compared with average similarity of all the pairs at distance 2 at time t

Pairs that are going to get connected show a substantially higher similarity
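The comparison on this slide can be approximated as follows. The sketch assumes an undirected adjacency dictionary (node to set of neighbors), profiles stored as sets of item identifiers, and node labels that can be ordered; it uses the raw number of common items as the similarity. All names are illustrative, not taken from the talk.

```python
def pairs_at_distance_two(adj):
    """Unordered pairs that share a neighbor but are not linked themselves."""
    pairs = set()
    for neighbors in adj.values():
        for u in neighbors:
            for v in neighbors:
                if u < v and v not in adj.get(u, ()):
                    pairs.add((u, v))
    return pairs


def mean_common_items(pairs, profiles):
    """Average number of shared profile items over a non-empty set of pairs."""
    pairs = list(pairs)
    total = sum(len(profiles[u] & profiles[v]) for u, v in pairs)
    return total / len(pairs)
```

Calling `mean_common_items` once on the pairs that become linked at t+1 and once on `pairs_at_distance_two(adj)` gives the two kinds of averages compared in the table.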

Page 15: A glimpse on social influence and link prediction in OSNs

Link creation → similarity (influence)

[Figure: evolution of the similarity between pairs linking together at different times; panels: Groups, Books]

Page 16: A glimpse on social influence and link prediction in OSNs

Summary

Theories to explain link creation:
◦ Self-interest
◦ Mutual-interest
◦ Exchange: reciprocity in linking
◦ Contagion: social influence
◦ Balance: triangle closure
◦ Homophily: for all profile features
◦ Proximity: geographical and on the social graph

Can we exploit the observations on these phenomena to predict future links?

Page 17: A glimpse on social influence and link prediction in OSNs

Outline

• Dataset
• Topical overlap
• Homophily and influence
• Link prediction
• Conclusions

Page 18: A glimpse on social influence and link prediction in OSNs

Link prediction

• Snapshots at time t and t+1
• Predict links created between t and t+1 given the whole information at time t
• Supervised learning approach to combine profile and structural features

Learning set example:

Pair Id   Library sim.   Common neighbors   Will be connected?
1         0.56           18                 1
2         0.11           5                  0
3         0.71           36                 1
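A learning set with these columns might be assembled as in the sketch below. It assumes libraries and neighbor sets are Python sets taken from the snapshot at time t, and that the label is 1 when a link between the pair exists in either direction at t+1. `build_learning_set` and its arguments are hypothetical names chosen for the example.

```python
import math


def build_learning_set(pairs, libraries, neighbors_t, links_t1):
    """Return rows of (pair id, library similarity, common neighbors, label).

    Features use only information from time t; the label is read from
    the set of links observed at t+1.
    """
    rows = []
    for pair_id, (u, v) in enumerate(pairs, start=1):
        lib_u, lib_v = libraries[u], libraries[v]
        denom = math.sqrt(len(lib_u) * len(lib_v))
        library_sim = len(lib_u & lib_v) / denom if denom else 0.0
        common = len(neighbors_t[u] & neighbors_t[v])
        label = int((u, v) in links_t1 or (v, u) in links_t1)
        rows.append((pair_id, library_sim, common, label))
    return rows
```

Each row can then be fed to any supervised classifier; further profile or structural features simply become extra columns.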

Page 19: A glimpse on social influence and link prediction in OSNs

Features

Profile
◦ Library (cosine)
◦ Groups (cosine)
◦ Groups (size): $s_{xy} = \sum_{g \in G(x) \cap G(y)} \frac{1}{|g|}$
◦ Gender {0,1}
◦ Town {0,1}
◦ Age (|age1 – age2|)
◦ Country {0,1}
◦ Vocabulary (cosine)
◦ Wishlists (cosine)
◦ Tagging behavior

Structural
◦ Common neighbors
◦ Distance on graph
◦ Preferential attachment: $s_{xy} = k(x) \cdot k(y)$
◦ Resource allocation: $s_{xy} = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{k(z)}$
◦ Local path: $S = A^2 + \epsilon A^3$, $\epsilon \in [0,1]$
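The structural scores named on this slide are standard in the link prediction literature, and a minimal Python sketch of them follows. It assumes an undirected graph stored as a dictionary from node to neighbor set, uses the usual definitions (degree product for preferential attachment, sum of inverse degrees of common neighbors for resource allocation, walks of length 2 plus epsilon times walks of length 3 for local path), and adds the group-size score as a sum of inverse sizes of shared groups. Function names and data layout are illustrative only.

```python
def common_neighbors(adj, x, y):
    """Number of neighbors shared by x and y."""
    return len(adj[x] & adj[y])


def preferential_attachment(adj, x, y):
    """Product of the degrees of x and y."""
    return len(adj[x]) * len(adj[y])


def resource_allocation(adj, x, y):
    """Sum of 1/k(z) over the common neighbors z of x and y."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])


def local_path(adj, x, y, epsilon=0.01):
    """Entry (x, y) of A^2 + epsilon * A^3, counted as walks of length 2 and 3."""
    walks2 = len(adj[x] & adj[y])
    walks3 = sum(1 for a in adj[x] for b in adj[a] if y in adj[b])
    return walks2 + epsilon * walks3


def group_size_score(groups_x, groups_y, group_sizes):
    """Sum of 1/|g| over the groups shared by two users."""
    return sum(1.0 / group_sizes[g] for g in groups_x & groups_y)
```

Small shared groups and low-degree common neighbors contribute more to the respective scores, on the intuition that they are more specific evidence of a tie than very large groups or hubs.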

Page 20: A glimpse on social influence and link prediction in OSNs

Link prediction: preliminary results

Rotation forest, 10-fold cross-validation, balanced sets

             Precision   Recall   F-measure   AUC
Structural   0.782       0.778    0.777       0.838
Topical      0.746       0.746    0.746       0.82
Complete     0.827       0.826    0.826       0.9

Rotation forest, 10-fold cross-validation, unbalanced sets (complete feature set)

K-ratio   Precision   Recall   F-measure   AUC
1:1       0.827       0.826    0.826       0.9
1:10      0.934       0.94     0.933       0.897
1:100     0.988       0.991    0.987       0.86
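To make the K-ratio and the reported scores concrete, the sketch below builds a labelled sample with K negative pairs per positive pair and computes precision, recall and F-measure for the positive class. The slides do not say how negatives were sampled or whether scores are averaged over both classes, so both choices here are assumptions, and the helper names are hypothetical.

```python
import random


def sample_pairs(positives, negatives, k, seed=0):
    """Labelled sample with up to k negative pairs for every positive pair."""
    rng = random.Random(seed)
    n_neg = min(len(negatives), k * len(positives))
    sampled = rng.sample(list(negatives), n_neg)
    return [(p, 1) for p in positives] + [(n, 0) for n in sampled]


def precision_recall_f(y_true, y_pred):
    """Precision, recall and F-measure of the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

With strongly unbalanced sets, scores averaged over both classes are dominated by the easy negative class, so positive-class values like the ones computed here are usually the more demanding test.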

Page 21: A glimpse on social influence and link prediction in OSNs

Outline

• Dataset
• Topical overlap
• Homophily and influence
• Link prediction
• Conclusions

Page 22: A glimpse on social influence and link prediction in OSNs

Conclusions and future work

• Theories on social network growth are verified
• Causality between similarity and social connection
• Effective link detection/prediction
◦ Topical information seems to be predictive as well as structural information
• RFC:
◦ Link prediction sampling/evaluation procedure
◦ New challenges in prediction

Page 23: A glimpse on social influence and link prediction in OSNs

Speaker: Luca Maria Aiello
[email protected]

www.di.unito.it/~aiello

Thank you for your attention!

Workshop on Data Driven Dynamical Networks

Reference: L. M. Aiello, A. Barrat, C. Cattuto, G. Ruffo, R. Schifanella, "Link creation and profile alignment in the aNobii social network". In SocialCom'10: Proceedings of the 2nd IEEE International Conference on Social Computing, Minneapolis, MN, USA, August 2010.