User Identities across Social Networks: Quantifying Linkability and Nudging Users to control Linkability
Srishti Chandok
Dr. Ponnurangam Kumaraguru (Advisor)
Thesis Committee
Dr. Arun Balaji Buduru, IIIT-Delhi
Dr. Anuja Arora, JIIT-Noida
Dr. Ponnurangam Kumaraguru (Chair), IIIT-Delhi
Publication
Our work has been accepted as a short paper (8 pages) at Social Informatics 2017.
Why people join multiple OSNs?
Type of content being shared:
✦ Images
✦ Videos
✦ Short messages
✦ Combination of messages, video and images
✦ Online reviews
✦ Discussion forums
Type of network being offered:
✦ Professional network
✦ Personal network
Notion of Linkability
Linkability is a metric that quantifies the closeness between two identities belonging to the same user on different social networks
Identity 1: Username: RaineRamirez1; Name: Chris Raine Ramirez; Location: Caloocan City; Website: NULL
Identity 2: Username: Rainevouz; Name: Christopher Delgar Ramirez Bakunawa; Location: San Jose del Monte, Bulacan; Website: NULL
Linkability Score = 0.31
There is a 31% chance that Rainevouz and RaineRamirez1 are the same person
Motivation
De-duplicating audience - finding linkability across OSNs
✦ Social audience = 437,632 + 153,000 + 805,097 or less?
✦ Targeted marketing using aggregated data
Social Engineering - Aggregation of information makes attacking easy
1. Hacker studies Karla who is active on
2. Information aggregation:
   • Likes playing guitar and softball
   • Works on making reports at the office and recently received an award
   • Likes wine
   • Nick is her boss and Curt is her colleague
3. Hacker crafts an email
4. Downloaded on Karla’s machine
5. Hacker installs a remote access tool on the machine
Cracking passwords - Personal data across multiple social networks can be gathered to answer password recovery questions
Security Question: Street where you grew up?
Primary School name found on an OSN: ‘xyz’ school
Streets close to ‘xyz’ school
Research Aim
“Develop a real-time system which can help users to maintain their linkability across social networks.”
AIM
Linkability Score Computation
Linkability Nudge Design
Current State of Art
Identity Resolution -> Linkability Score
Privacy Nudges -> Linkability Nudge
Identity Resolution (Contd.)
✦ MOBIUS - ACM SIGKDD (2013), 91% accuracy
✦ NEMO - ACM HyperText (2015), 41% accuracy
✦ HYDRA - ACM SIGMOD (2014)
Privacy Nudges (Contd.)
✦ Wang, Yang, et al. Privacy nudges for social media: an exploratory Facebook study - Profile picture nudge, timer nudge and sentiment nudge
✦ Ronald. Context is everything: sociality and privacy in online social network sites - Segregation of the audience for a user's profile attributes on OSNs so that their visibility is controllable.
Novelty
Nudging users to prevent disclosures owing to the resolution of their multiple identities
Identity Resolution
Privacy Nudges
Architecture Diagram
[Flowchart: Activities performed on OSNs; Recomputation; Linkability Score; decision "Linkability score exceeds range?" (YES / NO); Linkability Nudge]
Linkability Score
1. Baseline Methods: Weighted Sum, Probabilistic
2. Reformed Linkability Score
Baseline Methods
1. Data Collection
2. Methods
   Weighted Sum Method
   Probabilistic Method
Data Collection
Pipeline: Streaming API -> Tweets -> fb.me links -> Link Expander -> Database
Example: http://fb.me/8dR49RHpQ expands to https://www.facebook.com/christie.andresen/posts/10210420711856356
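The "Tweets -> fb.me links" step of the collection pipeline can be sketched as a regular-expression scan over tweet text. This is a minimal sketch, not the collector used in the thesis: the function name, the link pattern and the sample tweets are all hypothetical, and the link-expansion step (following the redirect to the full Facebook URL) is left out because it needs network access.

```python
import re

# Matches shortened Facebook links such as http://fb.me/8dR49RHpQ.
# The exact pattern is an assumption about what the short links look like.
FB_ME_PATTERN = re.compile(r"https?://fb\.me/[A-Za-z0-9]+")


def extract_fb_me_links(tweet_text: str) -> list[str]:
    """Return every fb.me short link found in one tweet's text."""
    return FB_ME_PATTERN.findall(tweet_text)


if __name__ == "__main__":
    # Hypothetical tweets, for illustration only.
    tweets = [
        "New post is up http://fb.me/8dR49RHpQ",
        "No link in this one",
    ]
    for text in tweets:
        print(extract_fb_me_links(text))
```

A tweet that carries an fb.me link ties a Twitter identity to a Facebook post, which is presumably how the pairs of identities stored in the database were obtained.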
Description of Data        Count
Positive Data              23,985
Negative Data (Type I)     96,130
Negative Data (Type II)    24,560
Positive Data: <IFb> = <ITw>, the identities are the same [bob12, bob12]
Negative Data (Type I): <IFb> ≠ <ITw>, but the identities appear to be similar [bob_c, bob_d]
Negative Data (Type II): <IFb> ≠ <ITw>, and the identities appear to be dissimilar [bob, alice]
Weighted Sum Method
Pipeline: Feature Extractor -> Metric Calculator -> Weighted Sum Calculator -> Linkability Score
$$\text{Linkability Score} = \frac{\sum_i w_i f_i}{\sum_i w_i}$$
Weighted Sum Method (Contd.)
Identity 1: Username: RaineRamirez1; Name: Chris Raine Ramirez; Location: Caloocan City; Website: NULL
Identity 2: Username: Rainevouz; Name: Christopher Delgar Ramirez Bakunawa; Location: San Jose del Monte, Bulacan; Website: NULL
Features: Username, Name, Location, Website
Metrics: Longest Common Subsequence, Edit Distance, etc.
Feature scores: Username = 0.39, Name = 0.28, Location = 0.6, Website = 0
Weighted Sum Calculator: Linkability Score = 0.31
Need for different Weights
Uniform feature weights = 1, 1, 1, 1 for username, name, geo-location and website/url, respectively.
Non-uniform feature weights = 2, 3, 4, 1 for username, name, geo-location and website/url, respectively.
Non-uniform weights increase the difference between the scores of positive data and negative data, ensuring that a negative data (Type I) identity is not mistaken for a positive identity.
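The effect of the two weight sets can be checked on the RaineRamirez1 / Rainevouz example using the weighted sum formula, score = sum(w_i * f_i) / sum(w_i). The sketch below is illustrative only: the function and variable names are hypothetical, and the inputs are the rounded feature scores shown on the slide, so the uniform-weight result (about 0.32) differs slightly from the 0.31 reported there.

```python
def weighted_sum_score(feature_scores: dict[str, float],
                       weights: dict[str, float]) -> float:
    """Linkability score = sum(w_i * f_i) / sum(w_i)."""
    total_weight = sum(weights[name] for name in feature_scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted = sum(weights[name] * score
                   for name, score in feature_scores.items())
    return weighted / total_weight


# Per-feature similarity scores from the RaineRamirez1 / Rainevouz example.
FEATURES = {"username": 0.39, "name": 0.28, "location": 0.6, "website": 0.0}

UNIFORM = {"username": 1, "name": 1, "location": 1, "website": 1}
NON_UNIFORM = {"username": 2, "name": 3, "location": 4, "website": 1}

if __name__ == "__main__":
    print(weighted_sum_score(FEATURES, UNIFORM))      # uniform weights
    print(weighted_sum_score(FEATURES, NON_UNIFORM))  # weights 2, 3, 4, 1
```

With weights 2, 3, 4, 1 the same pair scores about 0.40, because the location feature, which has the highest similarity here, also carries the largest weight.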
Probabilistic Method
Pipeline: Feature Extractor -> Metric Calculator -> Probability Finder -> Linkability Score
$$\text{Linkability Score} = \frac{\Pr(pd)}{\Pr(pd) + \Pr(nd_1) + \Pr(nd_2)}$$
Probabilistic Method (Contd.)
Identity 1: Username: RaineRamirez1; Name: Chris Raine Ramirez; Location: Caloocan City; Website: NULL
Identity 2: Username: Rainevouz; Name: Christopher Delgar Ramirez Bakunawa; Location: San Jose del Monte, Bulacan; Website: NULL
Features: Username, Name, Location, Website
Metrics: Longest Common Subsequence, Edit Distance, etc.
Feature scores: Username = 0.39, Name = 0.28, Location = 0.6, Website = 0
Probability Finder: Linkability Score = 0.12
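The probabilistic score is a normalisation of Pr(pd) against the three class probabilities. The sketch below assumes that pd, nd1 and nd2 stand for positive data, negative data (Type I) and negative data (Type II); the slides do not say how each probability is estimated from the feature scores, so the function simply takes them as inputs, and the numbers in the usage example are made up.

```python
def probabilistic_score(pr_pd: float, pr_nd1: float, pr_nd2: float) -> float:
    """Pr(pd) / (Pr(pd) + Pr(nd1) + Pr(nd2))."""
    for p in (pr_pd, pr_nd1, pr_nd2):
        if p < 0:
            raise ValueError("probabilities must be non-negative")
    denominator = pr_pd + pr_nd1 + pr_nd2
    if denominator == 0:
        # Assumption: with no evidence for any class, report no linkability.
        return 0.0
    return pr_pd / denominator


if __name__ == "__main__":
    # Hypothetical class probabilities, not taken from the thesis data.
    print(probabilistic_score(0.2, 0.5, 0.3))
```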
Comparison of Baseline methods
Weighted Sum method: accuracy is 87% when threshold value = 0.39 (feature weights used are 2, 3, 4 and 1 for the username, name of user, location and website features, respectively)
Probabilistic method: accuracy is 32% when threshold value = 0.71
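Accuracy at a threshold can be computed by turning each linkability score into a same-user / different-user decision and comparing it with the ground-truth label. This sketch assumes a pair is predicted positive when its score is greater than or equal to the threshold (the slides do not state the direction or inclusivity of the comparison), and the scored pairs below are invented for illustration.

```python
def predict_same_user(score: float, threshold: float) -> bool:
    """Assumed decision rule: positive when score >= threshold."""
    return score >= threshold


def accuracy(scored_pairs: list[tuple[float, bool]], threshold: float) -> float:
    """Fraction of pairs whose prediction matches the ground-truth label."""
    if not scored_pairs:
        raise ValueError("need at least one scored pair")
    correct = sum(
        predict_same_user(score, threshold) == is_same_user
        for score, is_same_user in scored_pairs
    )
    return correct / len(scored_pairs)


if __name__ == "__main__":
    # (linkability score, ground truth) pairs: hypothetical values.
    pairs = [(0.80, True), (0.45, True), (0.35, False),
             (0.10, False), (0.42, False)]
    print(accuracy(pairs, threshold=0.39))
```

In this toy data the last pair plays the role of a negative (Type I) example: it scores above the threshold although the identities differ, which is the kind of error the non-uniform weights are meant to reduce.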
Limitations of Baseline Methods
The Probabilistic method did not produce the anticipated results; its accuracy was quite low.
Both methods employ a small set of features, namely name, username, geo-location and website.
They fail to capture the user’s content sharing behavior.
Take-aways from Baseline Methods
The Weighted Sum method performed better than the Probabilistic method.
We will enhance the feature set using well-known identity resolution techniques and use our proposed Weighted Sum method to compute linkability scores.
Reformed Linkability Score
Leverages features from three state-of-the-art identity resolution techniques and uses the Weighted Sum method to calculate linkability scores:
MOBIUS: http://www.public.asu.edu/~huanliu/papers/kdd2013.pdf
NEMO: http://precog.iiitd.edu.in/Publications_files/19wole01-jain.pdf
HYDRA: http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=3650&context=sis_research
Reformed Linkability Score (Contd.)
Pipeline: Authorize and Fetch Data -> Feature Extraction (MOBIUS, NEMO, HYDRA) -> Linkability Score
MOBIUS
[Figure: username comparison across the identities srishti.chandok, srishti.chandok and SrishtiChandok; value shown: 1]
NEMO
Username: Srishti.chandok & SrishtiChandok
Name: Srishti Chandok & Srishti Chandok
Location: New Delhi, India & Delhi, India
[Two further pairs were shown as images in the original slide]
Scores shown: 1; 0.75, 0; 1; 0.98; 1; 0.79
Features: Username Similarity Score, Name Similarity Score, Location Similarity Score, Profile Image Similarity Score, Content Similarity Score -> Weighted Sum
HYDRA
Name: Srishti Chandok & Srishti Chandok
Education: 'Salwan Public School, Rajinder Nagar, New Delhi 110060', 'I P University', 'maharaja agarsen institute of technology', 'IIIT Delhi' & MTech from IIIT Delhi, BTech from Maharaja Agrasen Institute of Technology, IPU, SPS
Profession: IIIT Delhi Teaching Assistant & MTech from IIIT Delhi, BTech from Maharaja Agrasen Institute of Technology, IPU, SPS
Website: http://www.twitter.com/SrishtiChandok & twitter.com/srishtichandok
[One further pair was shown as an image in the original slide]
Scores shown: 0.32, 0.24, 1, 1, 1, 0.5
Features: Name Similarity Score, Education Similarity Score, Profession Similarity Score, Website Similarity Score, Content Similarity Score -> Weighted Sum
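The website pair in the HYDRA example differs only in scheme, "www." prefix and letter case, so a canonical-URL comparison of the kind listed for the baseline website feature would treat the two as equal. The sketch below is one possible canonicalisation, not the rule used in the thesis: dropping the scheme, stripping "www.", and lowercasing the path are all assumptions (lowercasing the path is reasonable for Twitter handles but not for URLs in general).

```python
from urllib.parse import urlsplit


def canonical_url(url: str) -> str:
    """Reduce a profile URL to 'host/path', lowercased, without 'www.'."""
    url = url.strip()
    if "://" not in url:
        # urlsplit only recognises the host when a scheme is present.
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path.rstrip("/").lower()
    return host + path


def website_similarity(url_a: str, url_b: str) -> float:
    """1.0 when the canonical forms are identical, otherwise 0.0."""
    return 1.0 if canonical_url(url_a) == canonical_url(url_b) else 0.0


if __name__ == "__main__":
    print(website_similarity("http://www.twitter.com/SrishtiChandok",
                             "twitter.com/srishtichandok"))
```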
Linkability Nudge
Soft paternalistic intervention
Alerts users whenever their behavior changes the linkability score beyond a pre-configured range
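The trigger condition amounts to a range check on the recomputed score. A minimal sketch, assuming the pre-configured range is a closed interval; the class name, function name and the bounds 0.2 and 0.6 are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkabilityRange:
    """Pre-configured range in which the score may move without a nudge."""
    low: float
    high: float

    def contains(self, score: float) -> bool:
        return self.low <= score <= self.high


def should_nudge(new_score: float, allowed: LinkabilityRange) -> bool:
    """Nudge only when the recomputed score leaves the allowed range."""
    return not allowed.contains(new_score)


if __name__ == "__main__":
    allowed = LinkabilityRange(low=0.2, high=0.6)  # hypothetical bounds
    print(should_nudge(0.31, allowed))
    print(should_nudge(0.72, allowed))
```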
Components of Linkability Nudge
1. Browser Extension
2. Nudge Server
3. Linkability Compute Server
1. Browser Extension
✦ Maintains the user's identity across the entire user session.
✦ Captures the user's posting activity and changes in profile attributes on all configured OSM platforms.
✦ Displays the linkability nudge in various forms (notifications and color).
[Diagram: the user downloads the Chrome browser extension; the Browser Extension sends the user’s activity information to the Nudge Server and receives the linkability score and pie charts]
2. Nudge Server
✦ Intermediary between the browser extension and the linkability compute server.
✦ Receives the user's access tokens from the browser extension and sends them to the linkability compute server to obtain the user's data.
✦ Passes information about the user's activities, such as making a post or changing a profile attribute, to the linkability compute server.
✦ Sends the newly computed linkability scores to the browser extension from time to time, based on the user's activities.
[Diagram: the Browser Extension sends access tokens for various OSNs to the Nudge Server, which forwards the access tokens and the user’s activity information to the Linkability Compute Server; the linkability score is returned to the Browser Extension]
3. Linkability Compute Server
✦ Fetches the user's data from the API endpoints.
✦ Implements the identity resolution methods to compute linkability scores.
✦ Receives each of the user’s activities (post or profile attribute change), recomputes the linkability scores and sends them back to the nudge server.
[Diagram: Nudge Server and Linkability Compute Server; the compute server fetches the user’s data, runs the identity resolution algorithms (NEMO | HYDRA | MOBIUS) and returns the linkability score]
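The message flow between the three components can be sketched as three in-process classes. This is an illustration of the roles described above, not the actual system: every class and method name is hypothetical, network calls are replaced by direct method calls, and the scorer is a stub standing in for the identity resolution features.

```python
from typing import Callable, Dict

Activity = Dict[str, str]


class LinkabilityComputeServer:
    """Holds access tokens and recomputes the score on every activity."""

    def __init__(self, scorer: Callable[[Activity], float]) -> None:
        self._scorer = scorer  # stub for the NEMO / HYDRA / MOBIUS features
        self._tokens: Dict[str, Dict[str, str]] = {}

    def register_tokens(self, user_id: str, tokens: Dict[str, str]) -> None:
        self._tokens[user_id] = dict(tokens)

    def recompute(self, user_id: str, activity: Activity) -> float:
        if user_id not in self._tokens:
            raise KeyError(f"no access tokens registered for {user_id}")
        return self._scorer(activity)


class NudgeServer:
    """Intermediary: forwards tokens and activities, relays scores back."""

    def __init__(self, compute_server: LinkabilityComputeServer) -> None:
        self._compute = compute_server

    def forward_tokens(self, user_id: str, tokens: Dict[str, str]) -> None:
        self._compute.register_tokens(user_id, tokens)

    def forward_activity(self, user_id: str, activity: Activity) -> float:
        return self._compute.recompute(user_id, activity)


class BrowserExtension:
    """Captures the user's activity and receives the new score."""

    def __init__(self, user_id: str, nudge_server: NudgeServer) -> None:
        self._user_id = user_id
        self._nudge_server = nudge_server

    def authorize(self, tokens: Dict[str, str]) -> None:
        self._nudge_server.forward_tokens(self._user_id, tokens)

    def on_activity(self, activity: Activity) -> float:
        return self._nudge_server.forward_activity(self._user_id, activity)


if __name__ == "__main__":
    compute = LinkabilityComputeServer(scorer=lambda activity: 0.5)  # stub score
    extension = BrowserExtension("user-1", NudgeServer(compute))
    extension.authorize({"twitter": "token-a", "facebook": "token-b"})
    print(extension.on_activity({"type": "post", "text": "hello"}))
```

Keeping the nudge server as a thin relay means the extension never talks to the compute server directly, which matches the intermediary role given to it in the slides.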
Nudge Design
Content-driven Color Nudge -
✦ Similar Post --> Red color nudge
✦ Dissimilar Post --> Green color nudge
Attribute-driven Notification Nudge -
✦ Profile attribute update --> Linkability Score crosses range --> Notification Popup nudge
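The content-driven color nudge can be sketched as a similarity test between a new post and the user's posts on the other networks. The slides do not say which content similarity measure or cut-off is used, so the word-level Jaccard similarity and the 0.5 threshold below are assumptions chosen only to make the red/green rule concrete.

```python
def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Word-level Jaccard similarity between two posts."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def color_for_post(new_post: str, posts_on_other_osns: list[str],
                   threshold: float = 0.5) -> str:
    """Red when the new post resembles an existing one, green otherwise."""
    best = max(
        (jaccard_similarity(new_post, old) for old in posts_on_other_osns),
        default=0.0,
    )
    return "red" if best >= threshold else "green"


if __name__ == "__main__":
    earlier = ["off to the guitar class today"]  # hypothetical earlier post
    print(color_for_post("off to the guitar class today", earlier))
    print(color_for_post("quarterly report done", earlier))
```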
Content-driven Color Nudge
Attribute-driven Notification Nudge
Demo
User Evaluation of the Nudge
50
ControlPeriod
TreatmentPeriod
No exposure to linkability nudgeTasks (Post and profile updates)
Exposure to linkability nudgeTasks (Post and profile updates)
Analysis of User Evaluation
58% of the participants: Understood the broad concept of linkability score
42% of the participants: Became more aware of the linkability of their multiple identities across OSNs
84% of the participants: Noticed the factors contributing to their linkability scores
83% of the participants: Liked the color nudge and pie charts more
[Figure: Activities performed by one of the participants]
Limitations
Time delay (2-5 seconds) while making a post during the treatment period.
Used uniform weights for computing linkability scores.
System works for three social networks.
Evaluated the nudge with a small number of participants.
Conclusions
Leverage features from well-known methods for identity resolution (NEMO, HYDRA and MOBIUS) and use the proposed baseline method, Weighted Sum, to compute the linkability scores
Identify the factors (profile attributes and content) that have contributed to the computed linkability score
Design and develop linkability nudge, a soft intervention which alerts users whenever user behavior leads to change in linkability score beyond preconfigured range
Perform a detailed user study in a controlled lab experiment setting to assess effectiveness and utility of proposed linkability nudge
53
Acknowledgements
Rishabh Kaushal, PhD, IIIT-Delhi
Committee Members
Sonal and Sonu, Precogers
Precog members, family and friends
54
Thank You!
55
References
https://www.washingtonpost.com/investigations/social-engineering-using-social-media-to-launch-a-cyberattack/2012/09/26/a282c6be-0837-11e2-a10c-fa5a255a9258_graphic.html?utm_term=.148a1a0244af
https://www.infosecurity-magazine.com/news/phishing-and-social-engineering/
https://www.nytimes.com/2014/11/11/world/europe/for-guccifer-hacking-was-easy-prison-is-hard-.html
http://www.securityweek.com/social-media-makes-way-social-engineering
http://www.propertycasualty360.com/2017/07/04/how-social-engineering-fueled-the-cyber-attack-bus
Appendix
57
Features and Metrics for Baseline Methods
Feature Name    Metrics
username        Hamming Distance, Longest Common Subsequence, Edit Distance, Cosine Distance, Jaccard Distance, Jaro Winkler Distance
name            Length of Common Substring, Length of Common Prefix & Common Suffix
location        Length of Common Substring, Geo-location (Latitude & Longitude)
website         Canonical URL matching
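Three of the string metrics listed for the baseline methods can be sketched with standard dynamic programming. The functions below are textbook versions with hypothetical names; how the thesis normalises these raw values into per-feature scores between 0 and 1 is not stated, so no normalisation is applied here.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character edits."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    previous = [0] * (len(b) + 1)
    for char_a in a:
        current = [0]
        for j, char_b in enumerate(b, start=1):
            if char_a == char_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def common_prefix_length(a: str, b: str) -> int:
    """Number of leading characters shared by a and b."""
    count = 0
    for char_a, char_b in zip(a, b):
        if char_a != char_b:
            break
        count += 1
    return count


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(lcs_length("ABCBDAB", "BDCABA"))
    print(common_prefix_length("RaineRamirez1", "Rainevouz"))
```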
NEMO
59
HYDRA
60