User Identities Across Social Networks: Quantifying Linkability and Nudging Users to Control Linkability

Dr. Ponnurangam Kumaraguru (Advisor)
Srishti Chandok

Thesis Committee

Dr. Arun Balaji Buduru, IIIT-Delhi

Dr. Anuja Arora, JIIT-Noida

Dr. Ponnurangam Kumaraguru (Chair), IIIT-Delhi

Publication

Our work has been accepted as a short paper (8 pages) at Social Informatics 2017.

Why do people join multiple OSNs?

Type of content being shared:
✦ Images
✦ Videos
✦ Short messages
✦ Combination of messages, video and images
✦ Online reviews
✦ Discussion forums

Type of network being offered:
✦ Professional network
✦ Personal network

Notion of Linkability

Linkability is a metric which quantifies the closeness between two identities belonging to the same user on different social networks.

Username: RaineRamirez1 Name: Chris Raine Ramirez Location: Caloocan City Website: NULL

Username: Rainevouz Name: Christopher Delgar Ramirez Bakunawa Location: San Jose del Monte, Bulacan Website: NULL

Linkability Score = 0.31

There is a 31% chance that Rainevouz and RaineRamirez1 are the same person.

Motivation

✦ Social audience = 437,632 + 153,000 + 805,097, or less?

✦ Targeted marketing using aggregated data

De-duplicating the audience: finding linkability across OSNs


Social Engineering - Aggregation of information makes attacking easy


1. Hacker studies Karla, who is active on social networks.

2. Information aggregation:
   • Likes playing guitar and softball
   • Works on making reports at the office and recently received an award
   • Likes wine
   • Nick is her boss and Curt is her colleague

3. Hacker crafts an email.

4. Downloaded on Karla's machine.

5. Hacker installs a remote access tool on the machine.

Cracking passwords: personal data gathered across multiple social networks can supply answers to password recovery questions.

Security question: Street where you grew up?

✦ Primary school name found on an OSN: 'xyz' school
✦ Candidate answers: streets close to 'xyz' school

Research Aim

"Develop a real-time system which can help users to maintain their linkability across social networks."

The aim has two parts:
✦ Linkability Score Computation
✦ Linkability Nudge Design

Current State of the Art

✦ Identity Resolution (Linkability Score)
✦ Privacy Nudges (Linkability Nudge)


Identity Resolution (Contd.)

✦ MOBIUS: ACM SIGKDD (2013), 91% accuracy
✦ NEMO: ACM HyperText (2015), 41% accuracy
✦ HYDRA: ACM SIGMOD (2014)

Privacy Nudges (Contd.)

✦ Wang, Yang, et al. "Privacy nudges for social media: an exploratory Facebook study": profile picture nudge, timer nudge and sentiment nudge.

✦ Ronald. "Context is everything: sociality and privacy in online social network sites": segregation of the audience for a user's profile attributes on OSNs so that their visibility is controllable.

Novelty

Nudging users to prevent disclosures owing to the resolution of their multiple identities, drawing on both Identity Resolution and Privacy Nudges.

Architecture Diagram

[Figure: flowchart with the nodes "Activities performed on OSNs", "Linkability Score", the decision "Linkability score exceeds range?" (YES / NO), "Linkability Nudge" and "Recomputation"]

Linkability Score

1. Baseline Methods: Weighted Sum, Probabilistic

2. Reformed Linkability Score


Baseline Methods

1. Data Collection

2. Methods
   ✦ Weighted Sum Method
   ✦ Probabilistic Method

Data Collection

[Figure: data collection pipeline: Streaming API → Tweets → Fb.me links → Link Expander → Database]

Example of link expansion: http://fb.me/8dR49RHpQ expands to https://www.facebook.com/christie.andresen/posts/10210420711856356
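The link-expansion step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the regular expression, the function names and the injectable `resolver` argument are assumptions introduced here, and the default resolver simply follows HTTP redirects with the standard library.

```python
import re
import urllib.request
from typing import Callable, Dict, Iterable, List

# Assumed pattern for shortened Facebook links embedded in tweet text.
FB_ME_PATTERN = re.compile(r"https?://fb\.me/\w+")


def extract_fb_links(tweet_text: str) -> List[str]:
    """Return every fb.me short link found in a tweet."""
    return FB_ME_PATTERN.findall(tweet_text)


def resolve_with_urllib(url: str) -> str:
    """Follow redirects and return the final URL (performs a network call)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.geturl()


def expand_links(
    tweets: Iterable[str],
    resolver: Callable[[str], str] = resolve_with_urllib,
) -> Dict[str, str]:
    """Map each short link found in the tweets to its expanded URL."""
    expanded: Dict[str, str] = {}
    for text in tweets:
        for link in extract_fb_links(text):
            expanded[link] = resolver(link)
    return expanded
```

Passing a stub as `resolver` keeps the sketch testable without network access; in a real pipeline the expanded URLs would then be written to the database.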

Data Collection (Contd.)

Description of Data        Count
Positive Data              23,985
Negative Data (Type I)     96,130
Negative Data (Type II)    24,560

Positive Data: <IFb> = <ITw>, the identities are the same [bob12, bob12]

Negative Data (Type I): <IFb> ≠ <ITw> but the identities appear to be similar [bob_c, bob_d]

Negative Data (Type II): <IFb> ≠ <ITw> and the identities appear to be dissimilar [bob, alice]

Weighted Sum Method

[Figure: pipeline: Feature Extractor → Metric Calculator → Weighted Sum Calculator → Linkability Score]

$$\text{Linkability Score} = \frac{\sum_i w_i f_i}{\sum_i w_i}$$

Weighted Sum Method (Contd.)

Features: Username, Name, Location, Website

Metrics: Longest Common Subsequence, Edit Distance, etc.

Username: RaineRamirez1 Name: Chris Raine Ramirez Location: Caloocan City Website: NULL

Username: Rainevouz Name: Christopher Delgar Ramirez Bakunawa Location: San Jose del Monte, Bulacan Website: NULL

Metric Calculator output: Username = 0.39, Name = 0.28, Location = 0.6, Website = 0

Weighted Sum Calculator output: Linkability Score = 0.31

Need for different Weights

Feature weights = 1,1,1,1 for username, name, geo-location and website/url, respectively.

Feature weights = 2,3,4,1 for username, name, geo-location and website/url, respectively.

The weights are changed to increase the difference between positive data and negative data, thereby ensuring that a Negative Data (Type I) identity is not mistaken for a positive identity.
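The weighted sum formula can be worked through on the example feature scores from the slides (Username = 0.39, Name = 0.28, Location = 0.6, Website = 0). The sketch below is illustrative only; the function and variable names are assumptions. With uniform weights it yields about 0.3175, which appears consistent with the 0.31 shown on the slide, while the weights 2, 3, 4, 1 yield about 0.402.

```python
from typing import Dict


def weighted_sum_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Compute sum(w_i * f_i) / sum(w_i) over the given features."""
    total_weight = sum(weights[name] for name in features)
    if total_weight == 0:
        raise ValueError("at least one feature weight must be non-zero")
    weighted_total = sum(weights[name] * value for name, value in features.items())
    return weighted_total / total_weight


# Feature scores taken from the worked example in the slides.
features = {"username": 0.39, "name": 0.28, "location": 0.6, "website": 0.0}

uniform_weights = {"username": 1, "name": 1, "location": 1, "website": 1}
tuned_weights = {"username": 2, "name": 3, "location": 4, "website": 1}

uniform_score = weighted_sum_score(features, uniform_weights)
tuned_score = weighted_sum_score(features, tuned_weights)
```

Because location carries the largest weight in the second setting and has the highest feature score in this example, the tuned score is higher than the uniform one.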

Probabilistic Method

[Figure: pipeline: Feature Extractor → Metric Calculator → Probability Finder → Linkability Score]

$$\text{Linkability Score} = \frac{\Pr(pd)}{\Pr(pd) + \Pr(nd1) + \Pr(nd2)}$$

Here pd, nd1 and nd2 denote positive data, negative data (Type I) and negative data (Type II), respectively.

Probabilistic Method (Contd.)

Features: Username, Name, Location, Website

Metrics: Longest Common Subsequence, Edit Distance, etc.

Username: RaineRamirez1 Name: Chris Raine Ramirez Location: Caloocan City Website: NULL

Username: Rainevouz Name: Christopher Delgar Ramirez Bakunawa Location: San Jose del Monte, Bulacan Website: NULL

Metric Calculator output: Username = 0.39, Name = 0.28, Location = 0.6, Website = 0

Probability Finder output: Linkability Score = 0.12
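The normalization in the probabilistic formula can be sketched as below. The slides do not say how Pr(pd), Pr(nd1) and Pr(nd2) are estimated from the data, so this sketch takes them as inputs; the function name and the example values are hypothetical.

```python
def probabilistic_score(pr_pd: float, pr_nd1: float, pr_nd2: float) -> float:
    """Return Pr(pd) / (Pr(pd) + Pr(nd1) + Pr(nd2)).

    The three inputs are assumed to be non-negative likelihoods of the
    observed feature values under positive, Type I negative and Type II
    negative data, respectively.
    """
    total = pr_pd + pr_nd1 + pr_nd2
    if total == 0:
        # No evidence for any class: treat the pair as not linkable.
        return 0.0
    return pr_pd / total


# Hypothetical likelihoods, chosen only to illustrate the arithmetic.
example_score = probabilistic_score(pr_pd=0.3, pr_nd1=0.5, pr_nd2=0.2)
```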

Comparison of Baseline methods

Weighted Sum method: accuracy is 87% when threshold value = 0.39 [feature weights used are 2, 3, 4 and 1 for the username, name of user, location and website features, respectively]

Probabilistic method: accuracy is 32% when threshold value = 0.71
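A threshold-based accuracy of this kind can be computed as in the sketch below. It assumes that a pair is predicted to be the same user when its linkability score is at or above the threshold; that decision rule, the function names and the toy data are assumptions, not details taken from the slides.

```python
from typing import Iterable, List, Tuple

ScoredPair = Tuple[float, bool]  # (linkability score, truly the same user?)


def accuracy_at_threshold(scored_pairs: List[ScoredPair], threshold: float) -> float:
    """Fraction of pairs whose thresholded prediction matches the label."""
    correct = sum((score >= threshold) == is_same for score, is_same in scored_pairs)
    return correct / len(scored_pairs)


def best_threshold(scored_pairs: List[ScoredPair], candidates: Iterable[float]) -> float:
    """Pick the candidate threshold with the highest accuracy."""
    return max(candidates, key=lambda t: accuracy_at_threshold(scored_pairs, t))


# Toy labelled scores, invented for illustration only.
toy_pairs = [(0.9, True), (0.45, True), (0.6, True), (0.35, False), (0.1, False)]
```

On the toy data, a threshold of 0.39 separates the two classes perfectly, while 0.71 misclassifies two of the positive pairs.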

Limitations of Baseline Methods

✦ The Probabilistic method did not produce the anticipated results; its accuracy was quite low.
✦ Both methods employ a small set of features, namely name, username, geo-location and website.
✦ They fail to capture the user's content-sharing behavior.

Take-aways from Baseline Methods

✦ The Weighted Sum method performed better than the Probabilistic method.
✦ We will enhance the feature set using well-known identity resolution techniques and apply our proposed Weighted Sum method to compute linkability scores.


Reformed Linkability Score

Leverages features from three state-of-the-art identity resolution techniques and uses the Weighted Sum method to calculate linkability scores:

✦ MOBIUS: http://www.public.asu.edu/~huanliu/papers/kdd2013.pdf
✦ NEMO: http://precog.iiitd.edu.in/Publications_files/19wole01-jain.pdf
✦ HYDRA: http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=3650&context=sis_research

Reformed Linkability Score (Contd.)

[Figure: pipeline: Authorize and Fetch Data → Feature Extraction → MOBIUS, NEMO, HYDRA → Linkability Score]

MOBIUS

[Figure: MOBIUS example with the usernames srishti.chandok, srishti.chandok and SrishtiChandok, and the value 1]

NEMO

Compared attributes:
✦ Username: Srishti.chandok & SrishtiChandok
✦ Name: Srishti Chandok & Srishti Chandok
✦ Location: New Delhi, India & Delhi, India
✦ Profile image and content: compared values not preserved in the transcript

Scores computed: Username Similarity Score, Name Similarity Score, Location Similarity Score, Profile Image Similarity Score, Content Similarity Score, and their Weighted Sum

Values shown on the slide, in extraction order: 1; 0.75, 0; 1; 0.98; 1; 0.79

HYDRA

Compared attributes:
✦ Name: Srishti Chandok & Srishti Chandok
✦ Education: 'Salwan Public School, Rajinder Nagar, New Delhi 110060', 'I P University', 'maharaja agarsen institute of technology', 'IIIT Delhi' & MTech from IIIT Delhi, BTech from Maharaja Agrasen Institute of Technology, IPU, SPS
✦ Profession: IIIT Delhi Teaching Assistant & MTech from IIIT Delhi, BTech from Maharaja Agrasen Institute of Technology, IPU, SPS
✦ Website: http://www.twitter.com/SrishtiChandok & twitter.com/srishtichandok
✦ Content: compared values not preserved in the transcript

Scores computed: Name Similarity Score, Education Similarity Score, Profession Similarity Score, Website Similarity Score, Content Similarity Score, and their Weighted Sum

Values shown on the slide, in extraction order: 0.32; 0.24; 1; 1; 1; 0.5


Linkability Nudge

✦ Soft paternalistic intervention
✦ Alerts users whenever their behavior changes the linkability score beyond a pre-configured range
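The range check behind the nudge can be sketched as follows. The class and function names are hypothetical, and the slides do not specify how the range is configured, so the bounds here are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkabilityRange:
    """Pre-configured range of acceptable linkability scores (assumed inclusive)."""

    low: float
    high: float

    def contains(self, score: float) -> bool:
        return self.low <= score <= self.high


def nudge_needed(new_score: float, allowed: LinkabilityRange) -> bool:
    """Return True when the recomputed score falls outside the allowed range."""
    return not allowed.contains(new_score)


# Placeholder range; real bounds would come from the user's configuration.
allowed_range = LinkabilityRange(low=0.0, high=0.4)
```

With this placeholder range, a score of 0.31 would pass silently, while a score of 0.79 would trigger a nudge.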

Components of Linkability Nudge

1. Browser Extension
2. Nudge Server
3. Linkability Compute Server

1. Browser Extension

✦ Maintains the user's identity across the entire user session.
✦ Captures the user's posting activity and changes in profile attributes on all configured OSM platforms.
✦ Displays the linkability nudge in various forms (notifications and color).

[Figure: the user downloads the Chrome browser extension; the Browser Extension sends the user's activity information to the Nudge Server and receives the linkability score and pie charts]

2. Nudge Server

✦ Acts as an intermediary between the browser extension and the linkability compute server.
✦ Receives the user's access tokens from the browser extension and sends them to the linkability compute server to obtain the user's data.
✦ Passes information about the user's activities, such as making a post or changing a profile attribute, to the linkability compute server.
✦ Sends the newly computed linkability scores to the browser extension from time to time, based upon the user's activities.

[Figure: the Browser Extension sends access tokens for various OSNs to the Nudge Server, which forwards them to the Linkability Compute Server; the Nudge Server also forwards the user's activity information and returns the linkability score]

3. Linkability Compute Server

✦ Fetches the user's data from the API endpoints.
✦ Implements the identity resolution methods to compute linkability scores.
✦ Receives every piece of user activity information (post or profile attribute), recomputes the linkability scores and sends them back to the nudge server.

[Figure: the Linkability Compute Server fetches the user's data, runs the identity resolution algorithms (NEMO | HYDRA | MOBIUS) and returns the linkability score to the Nudge Server]
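The interaction between the three components can be sketched in-process as below. This is only a schematic under stated assumptions: the class names mirror the slides, but every method name, the activity format and the `toy_scorer` function are hypothetical stand-ins, and a real deployment would communicate over the network and call the OSN APIs.

```python
from typing import Callable, Dict, List, Tuple

Activity = Dict[str, object]  # e.g. {"type": "post", "similar": True}


class LinkabilityComputeServer:
    """Stores activities and recomputes the score with a pluggable scorer."""

    def __init__(self, scorer: Callable[[List[Activity]], float]) -> None:
        self.scorer = scorer
        self.activities: List[Activity] = []

    def recompute(self, activity: Activity) -> float:
        self.activities.append(activity)
        return self.scorer(self.activities)


class NudgeServer:
    """Intermediary that forwards activities and relays scores back."""

    def __init__(self, compute_server: LinkabilityComputeServer) -> None:
        self.compute_server = compute_server

    def forward_activity(self, activity: Activity) -> float:
        return self.compute_server.recompute(activity)


class BrowserExtension:
    """Captures activities and decides whether to show a nudge."""

    def __init__(self, nudge_server: NudgeServer, max_score: float) -> None:
        self.nudge_server = nudge_server
        self.max_score = max_score

    def on_user_activity(self, activity: Activity) -> Tuple[float, bool]:
        score = self.nudge_server.forward_activity(activity)
        return score, score > self.max_score


def toy_scorer(activities: List[Activity]) -> float:
    """Hypothetical scorer: share of activities flagged as similar."""
    return sum(1 for a in activities if a["similar"]) / len(activities)


extension = BrowserExtension(NudgeServer(LinkabilityComputeServer(toy_scorer)), max_score=0.5)
```

Keeping the scorer pluggable reflects the slide's point that the compute server hosts several identity resolution methods behind one interface.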

Nudge Design

Content-driven Color Nudge:
✦ Similar post → red color nudge
✦ Dissimilar post → green color nudge

Attribute-driven Notification Nudge:
✦ Profile attribute update → linkability score crosses range → notification popup nudge
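The content-driven color nudge can be illustrated with a simple text-similarity check. The slides do not state which content similarity measure or cut-off the system uses, so the word-level Jaccard similarity and the 0.5 threshold below are assumptions made for this sketch.

```python
from typing import Iterable


def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Word-level Jaccard similarity between two posts (case-insensitive)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def color_nudge(new_post: str, posts_on_other_osn: Iterable[str], threshold: float = 0.5) -> str:
    """Return "red" if the new post resembles any post on the other OSN, else "green"."""
    best = max(
        (jaccard_similarity(new_post, old) for old in posts_on_other_osn),
        default=0.0,
    )
    return "red" if best >= threshold else "green"
```

A draft that repeats a post already shared on another network would come back "red", prompting the user to reconsider before posting.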

Content-driven Color Nudge

Attribute-driven Notification Nudge

Demo

User Evaluation of the Nudge

✦ Control period: no exposure to the linkability nudge; tasks (post and profile updates)
✦ Treatment period: exposure to the linkability nudge; tasks (post and profile updates)

Analysis of User Evaluation

✦ 58% of the participants understood the broad concept of the linkability score.
✦ 42% of the participants became more aware of the linkability of their multiple identities across OSNs.
✦ 84% of the participants noticed the factors contributing to their linkability scores.
✦ 83% of the participants liked the color nudge and pie charts more.

[Figure: activities performed by one of the participants]

Limitations

✦ Time delay (2-5 seconds) while making a post during the treatment period.
✦ Used uniform weights for computing linkability scores.
✦ The system works for three social networks.
✦ Evaluated the nudge with a small number of participants.

Conclusions

✦ Leverage features from well-known methods for identity resolution (NEMO, HYDRA and MOBIUS) and use the proposed baseline method, Weighted Sum, to compute the linkability scores.
✦ Identify the factors (profile attributes and content) that have contributed to the computed linkability score.
✦ Design and develop the linkability nudge, a soft intervention which alerts users whenever their behavior changes the linkability score beyond a preconfigured range.
✦ Perform a detailed user study in a controlled lab experiment setting to assess the effectiveness and utility of the proposed linkability nudge.

Acknowledgements

✦ Rishabh Kaushal, PhD, IIIT-Delhi
✦ Committee Members
✦ Sonal and Sonu, Precogers
✦ Precog members, family and friends

Thank You!


References

https://www.washingtonpost.com/investigations/social-engineering-using-social-media-to-launch-a-cyberattack/2012/09/26/a282c6be-0837-11e2-a10c-fa5a255a9258_graphic.html?utm_term=.148a1a0244af

https://www.infosecurity-magazine.com/news/phishing-and-social-engineering/

https://www.nytimes.com/2014/11/11/world/europe/for-guccifer-hacking-was-easy-prison-is-hard-.html

http://www.securityweek.com/social-media-makes-way-social-engineering

http://www.propertycasualty360.com/2017/07/04/how-social-engineering-fueled-the-cyber-attack-bus

Appendix

Features and Metrics for Baseline Methods

Feature Name: Metrics
username: Hamming Distance, Longest Common Subsequence, Edit Distance, Cosine Distance, Jaccard Distance, Jaro Winkler Distance
name: Length of Common Substring, Length of Common Prefix & Common Suffix
location: Length of Common Substring, Geo-location (Latitude & Longitude)
website: Canonical URL matching
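Two of the username metrics listed above, Longest Common Subsequence and Edit Distance, can be sketched with standard dynamic programming. The `normalized_similarity` helper, which turns an edit distance into a score between 0 and 1, is an assumption added for illustration; the slides do not describe how raw metric values are scaled into feature scores.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Assumed scaling: 1 minus edit distance over the longer string's length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest
```

For the dataset examples, the identical pair [bob12, bob12] gets a similarity of 1.0, while the similar-looking pair [bob_c, bob_d] differs by a single edit.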

NEMO

HYDRA