@I seek 'fb.me': Identifying Users across Multiple Online Social Networks

@I seek 'fb.me': Identifying Users across Multiple Online Social Networks
Workshop on Web of Linked Entities (WoLE)
Paridhi Jain, Ponnurangam Kumaraguru, Anupam Joshi*
Indraprastha Institute of Information Technology (IIIT-Delhi)
*University of Maryland, Baltimore County (UMBC)

description

An online user joins multiple social networks in order to enjoy different services. On each social network she joins, she creates an identity made up of three major dimensions: profile, content, and connection network. She largely governs how her identity is formulated on any social network and can therefore manipulate multiple aspects of it. With no global identifier to mark her presence uniquely in the online domain, her online identities remain unlinked, isolated, and difficult to search. The literature has proposed identity search methods based on profile attributes, but has left the other identity dimensions, e.g. content and network, unexplored. In this work, we introduce two novel identity search algorithms based on content and network attributes and improve on the traditional identity search algorithm based on a user's profile attributes. We apply the proposed identity search algorithms to find a user's identity on Facebook, given her identity on Twitter. We report that a combination of the proposed identity search algorithms found the Facebook identity for 39% of the Twitter users searched, while the traditional method based on profile attributes found the Facebook identity for only 27.4%. Each proposed identity search algorithm accesses only publicly accessible attributes of a user on any social network. We deploy an identity resolution system, Finding Nemo, which uses the proposed identity search methods to find a Twitter user's identity on Facebook. We conclude that including more than one identity search algorithm, each exploiting distinct dimensional attributes of an identity, helps improve the accuracy of an identity resolution process.

Transcript of @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Page 1: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

@I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Workshop on Web of Linked Entities (WoLE)

Paridhi Jain¶, Ponnurangam Kumaraguru¶, Anupam Joshi*

¶Indraprastha Institute of Information Technology (IIIT-Delhi)

*University of Maryland, Baltimore County (UMBC)

Page 5: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

13/05/13 @I seek ‘fb.me’: Identifying Users across Multiple Online Social Networks

Motivation

Multiple OSNs

Multiple Identities

Identity Resolution Problem

Difficult to manage? Difficult to find?

Social Aggregation site

Friend Finder? Malicious user? Influential user? User of interest?

Page 6: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Identity Resolution

• For a user I, given a user identity IA on a social network A, find user identity IB on social network B.

[Figure: Alice's identity {IA} on network A is known; her identity {IB} on network B is unknown (??).]

Page 7: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Identity Resolution = Identity Search + Identity Matching

• Identity Search

For a user I, given her identity IA on a social network A, and a search parameter S, find the set of identities IBj on social network B such that S(IA) ≃ S(IBj).

{IA, S} → {IB1, ..., IBj, ..., IBN} = Q

• Identity Matching

Given a user identity IA on a social network A, a set of candidate identities Q on social network B, and a match function M, locate an identity pair (IA, IBj) such that M(IA, IBj) = max{M(IA, IB1), ..., M(IA, IBN)}

{IA, Q, M} → {IA, IBj} → {IB}
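The two-step definition above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: identities are hypothetical dictionaries, the search parameter S is a hypothetical "first name" key compared by exact equality (standing in for the approximate match ≃), and the match function M uses `difflib.SequenceMatcher` as a stand-in score.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional

# Hypothetical identity record, e.g. {"username": ..., "name": ...}
Identity = Dict[str, str]


def identity_search(i_a: Identity, network_b: List[Identity],
                    s: Callable[[Identity], str]) -> List[Identity]:
    """Identity search: build the candidate set Q of identities on B whose
    search key equals that of i_a (assumption: equality approximates S(IA) ≃ S(IBj))."""
    key = s(i_a)
    return [i_b for i_b in network_b if s(i_b) == key]


def identity_match(i_a: Identity, q: List[Identity],
                   m: Callable[[Identity, Identity], float]) -> Optional[Identity]:
    """Identity matching: return the candidate IBj that maximizes M(IA, IBj),
    or None when the candidate set is empty."""
    return max(q, key=lambda i_b: m(i_a, i_b), default=None)


def first_name(identity: Identity) -> str:
    """Hypothetical search parameter S: lower-cased first token of the name."""
    return identity["name"].split()[0].lower()


def name_similarity(a: Identity, b: Identity) -> float:
    """Hypothetical match function M: string similarity of the two names."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


# Made-up example data, not taken from the study.
twitter_identity = {"username": "alice123", "name": "Alice Naura"}
facebook_identities = [
    {"username": "jane_alice", "name": "Alice N. Janice"},
    {"username": "alice.naura", "name": "Alice Naura"},
    {"username": "bob.stone", "name": "Bob Stone"},
]

candidates = identity_search(twitter_identity, facebook_identities, first_name)
best = identity_match(twitter_identity, candidates, name_similarity)
print(len(candidates), best["username"])
```

Keeping search and matching as separate functions mirrors the decomposition on the slide: a looser search key enlarges Q and shifts more work onto the match function.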

Page 8: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Research Gaps?

• Till now, focus on better identity matching algorithms

• Only profile attributes (private and public) for Identity Search

• Limitations of Profile Search -

  • Restrictive search, owing to non-availability of common attributes across networks. [Gender on Facebook, but not on Twitter]

  • Search with limited attributes → Large candidate set size → Intensive Identity Matching computation

  • Users may choose different profile attributes → Miss out correct identity in the candidate set

• Little research on using content and network attributes to search for candidate identities

• Extensive use of both private and public attributes. Need user authorization for identity search

Page 11: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Proposal

• Include content and network attributes as search parameters

• Access only publicly accessible attributes

• Focus on two popular social networks - Twitter and Facebook

Page 12: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Contribution

• Proposed novel identity search methods on social networks

• Our identity resolution methods return the correct Facebook identity for 39% of Twitter users within top-2 ranks

• We observe an increase in accuracy of identity resolution by 11.6 percentage points (from 27.4% to 39%) owing to inclusion of content and network identity search, along with improved profile search

Page 13: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Methodology

[Flowchart. Recoverable labels: Search; Candidate Identities; Syntactic and Image Match; decision "If self-identified / returned by more than one search method" (Yes / No); Manual Verification.]

Page 14: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Identity Matching

• Syntactic Matching

  • Jaro Distance comparison between username and name

  • Example: {alice123, jane_alice}, {Alice Naura, Alice N. Janice}

• Image Matching

  [Equation not preserved in the transcript] where hIA and hIBj are the RGB histograms of the profile images and Ns represents the histogram size of IA
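For the syntactic matching step, a minimal Python sketch of the standard Jaro measure is given below. It is an assumption that the slides' "Jaro Distance" refers to this textbook definition; the function returns a similarity in [0, 1], where 1.0 means identical strings, and the function name is ours rather than the authors'.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Textbook Jaro similarity between two strings (1.0 = identical)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if equal and no farther apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


# Usernames and names from the slide's example pairs.
print(jaro_similarity("alice123", "jane_alice"))
print(jaro_similarity("Alice Naura", "Alice N. Janice"))
```

In a pipeline like the one described, candidates would be ranked by this score, computed separately on usernames and on display names.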

Page 15: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Profile Search

Self-Identification

Page 17: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Content Search

Page 19: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Self-mention Search

Page 21: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Network Search

Page 23: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Instance,

Public Friend List of a user extracted from public feeds

Page 24: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Integrated System -

Page 26: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Evaluation

Dataset               # of users
Social Graph API      543

Method (543 users)    # of users    % Accurate
Profile (P)           205           37.7
Content (C + SM)      34            6.3
Network (N)           1             0.2
Finding Nemo          212           39

Search Algorithm              # of users identified            Accuracy
P (without URL)               149                              27.4%
P (with URL) + C + N + SM     149 + 56 + 6 + 1 = 149 + 63      27.4% + 11.6%
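The accuracy figures in the evaluation tables can be checked with a few lines of Python. The counts (543, 149, 56, 6, 1) are taken from the slide; the variable names are ours, and the three additional counts are left unlabeled because the transcript does not say which search method each one belongs to. Note that 56 + 6 + 1 = 63 additional users, which gives the 212 users reported for Finding Nemo.

```python
TOTAL_USERS = 543            # dataset size from the slide
baseline_found = 149         # P (without URL)
additional_found = [56, 6, 1]  # extra users found by the added search methods

total_found = baseline_found + sum(additional_found)

# Accuracy as a percentage of the 543 users, rounded to one decimal place.
baseline_pct = round(100 * baseline_found / TOTAL_USERS, 1)
combined_pct = round(100 * total_found / TOTAL_USERS, 1)
gain_pct = round(100 * sum(additional_found) / TOTAL_USERS, 1)

print(total_found, baseline_pct, combined_pct, gain_pct)
```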

Page 27: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Mean Average Precision

Matching algorithm        MAP Score
Image (profile image)     0.83
Syntactic (username)      0.76
Syntactic (name)          0.80
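As a reminder of what a MAP score measures, here is a minimal Python sketch of the standard definition. It is an assumption that the scores above follow this definition; the candidate identifiers in the example are made up. When each query has exactly one correct identity, average precision reduces to 1 / (rank of the correct candidate).

```python
from typing import Iterable, List, Set, Tuple


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average of precision@k over the ranks k where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(queries: Iterable[Tuple[List[str], Set[str]]]) -> float:
    """Mean of average precision over all (ranked list, relevant set) queries."""
    scores = [average_precision(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical ranked candidate lists for two searched users.
example_queries = [
    (["fb_a", "fb_b", "fb_c"], {"fb_a"}),  # correct identity at rank 1
    (["fb_d", "fb_e", "fb_f"], {"fb_e"}),  # correct identity at rank 2
]
map_score = mean_average_precision(example_queries)
print(map_score)
```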

Page 28: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Demo

http://www.youtube.com/watch?v=-AFsCtKwO0c

Page 29: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Take away

Inclusion of content and network attributes for identity search not only improves identity resolution accuracy but also returns the correct Facebook identity within top-2 ranks for the majority of the Twitter users.

Page 30: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Current and Future Work

• Extend the social networks to search for a given identity. Example: Google+, Foursquare, etc.

• Extend the search methods to include social-network specific features

• Find multiple (fake) identities of users within social networks

Page 31: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

Questions?

[email protected], [email protected], [email protected]

Paper: http://precog.iiitd.edu.in/publications.html

Page 32: @I seek 'fb.me': Identifying Users across Multiple Online Social Networks

For any further information, please write to [email protected]

precog.iiitd.edu.in