Detection of Spam Tipping Behaviour on Foursquare
Anupama Aggarwal¶, Prof. P. Kumaraguru “PK”¶, Prof. J. Almeida*
¶ Indraprastha Institute of Information Technology (IIIT-Delhi, India)
* Universidade Federal de Minas Gerais (UFMG, Brazil)
Foursquare 101
‣ Location Based Social Network
‣ 33 Million Users *
‣ 3.5 Billion checkins *
‣ 31% of mobile social media users use Foursquare *
* As of January 2013
Foursquare 101
[Figure: annotated Foursquare app screens; labels: Your Last Checkin, Venue, Friends Activity, Friends Suggestions, Venue Suggestions, Location Sharing, OSN]
‣ Tip: Suggested Activity for a Venue
‣ Tip can be Liked or Saved
Spam Tips
‣ Tips unrelated to Venue
‣ Advertising / Marketing
‣ Scam / Phishing
Spam according to Foursquare ToS
‣ Tips with links to websites selling software, realtor contact info, a listing for your business, or other promotion
‣ Tips with inappropriate language or negativity directed at another person
‣ Unauthorized or unsolicited advertising, junk
Contributions
‣ Characterizing irregular user behaviour
  ‣ We observed different categories of spam users
  ‣ We characterize features distinguishing these spam users
‣ Automatic detection of spammers
  ‣ Distinguish between spam and legitimate Foursquare users
  ‣ Cluster spam users into different categories according to their behaviour
Data Crawling
2,400,594 tips
613,298 users
Observed Categories of Spam Users
‣ Marketing: These users post tips to promote and advertise a specific product / brand / venue / external URL
‣ Malicious: These users post external URLs in Tips that lead to spam / phishing / malware websites
‣ Abusive / Derogatory: These users try to defame or bad-mouth another person
‣ Self Promotion: These users try to draw attention to themselves
Ground Truth Data
Annotation Portal
2,000 Legitimate users
1,900 Spammers
Features used to detect Spammers
‣ User Attributes
  ‣ Properties of the Foursquare user profile and his checkins
‣ Social Attributes
  ‣ Friends network of the Foursquare user under inspection
‣ Content Attributes
  ‣ Details about Tips posted by the Foursquare user
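
The three attribute groups can be illustrated with a small feature-extraction sketch. The record layout (`checkins`, `badges`, `friends`, `tips`), the two regular expressions, and the function name `extract_features` are assumptions made for illustration; the slides do not describe the crawler output or how URLs and phone numbers were detected.

```python
import re

# Hypothetical patterns; the slides do not say how URLs or phone numbers were found.
URL_RE = re.compile(r"https?://\S+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def extract_features(user):
    """Build a small feature dict from a hypothetical user record."""
    tips = user["tips"]
    n_tips = len(tips)

    def per_tip(total):
        # Average a count over the user's tips, guarding against users with no tips.
        return total / n_tips if n_tips else 0.0

    return {
        # User attributes
        "num_tips": n_tips,
        "num_checkins": user["checkins"],
        "checkins_per_tip": per_tip(user["checkins"]),
        "num_badges": user["badges"],
        # Social attributes
        "num_friends": user["friends"],
        # Content attributes
        "num_urls": sum(len(URL_RE.findall(t)) for t in tips),
        "avg_words": per_tip(sum(len(t.split()) for t in tips)),
        "avg_phone_numbers": per_tip(sum(len(PHONE_RE.findall(t)) for t in tips)),
    }


# Invented example record, not taken from the crawled data.
example_user = {
    "checkins": 2,
    "badges": 0,
    "friends": 1,
    "tips": [
        "Cheap loans! Call 555-123-4567 now",
        "Visit http://example.com for deals",
    ],
}
features = extract_features(example_user)
```

Each key maps onto one of the listed attribute groups; a real pipeline would add the remaining features (mayorships, photos, likes, spam words) in the same way.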
Features used

| Category | χ2 rank | Feature |
|---|---|---|
| User Attributes | 1 | Number of Tips |
| User Attributes | 3 | Ratio of Check-ins and Tips |
| User Attributes | 4 | Number of Check-ins |
| User Attributes | 5 | Number of Badges |
| User Attributes | 11 | Number of Mayorships |
| User Attributes | 12 | Ratio of Check-ins and Badges |
| User Attributes | 15 | Number of Photos posted |
| Social Attributes | 6 | Number of Friends |
| Content Attributes | 2 | Similarity score of Tips |
| Content Attributes | 7 | Number of URLs posted |
| Content Attributes | 8 | Average number of words in Tips |
| Content Attributes | 9 | Average number of characters in Tips |
| Content Attributes | 10 | Ratio of number of likes and number of Tips |
| Content Attributes | 13 | Average number of spam words in Tips |
| Content Attributes | 14 | Average number of phone-numbers posted in Tips |
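
A χ2 ranking of this kind can be sketched as follows. The slides do not say how the features were discretised or which tool produced the ranks, so this example assumes each feature has already been reduced to a binary flag and computes the standard 2x2 chi-square statistic against the spam/legitimate label. The function name, feature names, and toy data are all invented for illustration.

```python
def chi_square_2x2(feature_flags, labels):
    """Chi-square statistic for a binary feature against a binary label."""
    # Observed counts: rows = feature present/absent, columns = spam/legitimate.
    observed = [[0, 0], [0, 0]]
    for flag, label in zip(feature_flags, labels):
        observed[0 if flag else 1][0 if label == "spam" else 1] += 1

    total = len(labels)
    row_sums = [sum(row) for row in observed]
    col_sums = [observed[0][j] + observed[1][j] for j in range(2)]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of feature and label.
            expected = row_sums[i] * col_sums[j] / total
            if expected > 0:
                stat += (observed[i][j] - expected) ** 2 / expected
    return stat


# Toy data, invented for illustration (not from the study).
labels = ["spam", "spam", "spam", "spam", "legit", "legit", "legit", "legit"]
binary_features = {
    "posts_urls": [1, 1, 1, 1, 0, 0, 0, 0],
    "many_tips": [1, 1, 1, 0, 1, 0, 0, 0],
    "has_photos": [1, 0, 1, 0, 1, 0, 1, 0],
}
scores = {name: chi_square_2x2(flags, labels) for name, flags in binary_features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

A feature that splits the two classes perfectly gets the highest score, while a feature that is equally common in both classes scores zero and ranks last.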
Few Observations
‣ Spammers post same/similar Tips on multiple venues
‣ A large fraction of spam Tips contain URLs
‣ Spam Tips may also have phone numbers
‣ Legitimate users have more Friends
‣ Spammers have very few Friends but large number of Tips
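
The first observation suggests a per-user similarity score over tips. The slides do not name the similarity measure that was used, so the sketch below assumes an average pairwise Jaccard similarity over lower-cased word sets; the function names and example tips are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between the word sets of two tips."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def tip_similarity_score(tips):
    """Average pairwise Jaccard similarity over all of a user's tips."""
    pairs = list(combinations(tips, 2))
    if not pairs:
        return 0.0  # fewer than two tips: nothing to compare
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Invented examples: near-duplicate tips versus unrelated tips.
spam_like = [
    "Best pizza deals at example.com",
    "Best pizza deals at example.com",
    "best pizza deals at example.com today",
]
varied = [
    "Great espresso here",
    "Try the rooftop seating",
    "Parking is hard on weekends",
]
spam_score = tip_similarity_score(spam_like)
varied_score = tip_similarity_score(varied)
```

Near-duplicate tips score close to 1, while tips with no shared words score 0, which is the kind of separation the observation describes.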
Relation between Tips and Check-ins
[Figure: plot with axes Check-ins and Tips; annotation: Irregular User Behaviour]
Tips Distribution
[Figure: two panels, Legitimate users and Spammers]
Classification Results

| Classification Algorithm | Precision (Spam) | Precision (Safe) | Recall (Spam) | Recall (Safe) | Accuracy |
|---|---|---|---|---|---|
| KNN | 83.2% | 86.6% | 86.3% | 83.5% | 84.89% |
| Decision Tree | 88.1% | 89.2% | 88.3% | 85.8% | 89.53% |
| Random Forest | 89.3% | 90.2% | 88.3% | 90.3% | 89.76% |
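
For reference, the five reported columns can all be derived from one binary confusion matrix with spam as the positive class. The sketch below uses hypothetical counts, not the study's confusion matrix, and the function name `binary_metrics` is an assumption.

```python
def binary_metrics(tp, fp, fn, tn):
    """Per-class precision and recall plus overall accuracy (spam = positive class)."""
    return {
        "precision_spam": tp / (tp + fp),  # predicted spam that really is spam
        "recall_spam": tp / (tp + fn),     # actual spam that was caught
        "precision_safe": tn / (tn + fn),  # predicted safe that really is safe
        "recall_safe": tn / (tn + fp),     # actual safe users left alone
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for illustration only.
m = binary_metrics(tp=80, fp=10, fn=20, tn=90)
```

With these counts, spam recall is 0.80, safe recall is 0.90, and accuracy is 0.85.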
Detection of Spam Classes
‣ Expectation-Maximization (EM) clustering
‣ Spammer Categories:
  ‣ Advertising / Marketing
  ‣ Self Promotion
  ‣ Abusive
  ‣ Malicious
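
The slides name EM clustering but not the mixture model or the implementation. As a minimal sketch, assuming a one-dimensional Gaussian mixture over a single toy feature, the E-step and M-step look like this; `em_gmm_1d`, the starting means, and the data are all invented for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, means, n_iter=50):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            probs = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weights, means, and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
    # Hard assignment: most responsible component for each point.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return means, variances, weights, labels


# Toy feature values (e.g. tips per user), invented for illustration.
data = [0, 1, 2, 1, 18, 20, 22, 19]
means, variances, weights, labels = em_gmm_1d(data, means=[0.0, 20.0])
```

On this well-separated toy data the two components settle on the low and high groups; clustering real spammers would use several features and one component per expected category.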
Detection of Spam Classes
‣ Clustering Accuracy for spammer categories:

| Spammer category | Clustering accuracy |
|---|---|
| Advertising | 88.23% |
| Self-Promotion | 87.23% |
| Abusive | 78.88% |
| Malicious | 0% |
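
The slides do not state how per-category clustering accuracy was computed. One common approach, assumed here, is to map each cluster to the majority category among its members and then measure, per category, the fraction of users that land in a cluster mapped to their own category. The function name and toy labels are hypothetical.

```python
from collections import Counter


def per_category_accuracy(true_labels, cluster_ids):
    """Map each cluster to its majority category, then score each category."""
    members = {}
    for label, cid in zip(true_labels, cluster_ids):
        members.setdefault(cid, []).append(label)
    # Majority vote decides which category a cluster stands for.
    cluster_to_label = {
        cid: Counter(labels).most_common(1)[0][0] for cid, labels in members.items()
    }

    correct, totals = Counter(), Counter()
    for label, cid in zip(true_labels, cluster_ids):
        totals[label] += 1
        if cluster_to_label[cid] == label:
            correct[label] += 1
    return {label: correct[label] / totals[label] for label in totals}


# Toy assignment, invented for illustration (not the study's data).
true_labels = ["ads", "ads", "ads", "self", "self", "abusive", "abusive", "malicious"]
cluster_ids = [0, 0, 1, 1, 1, 2, 2, 0]
acc = per_category_accuracy(true_labels, cluster_ids)
```

Under this scheme a category scores 0% when its users never form the majority of any cluster, as happens to the single "malicious" user in the toy data.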
Conclusion
‣ Analyzed spammer behaviour on Foursquare
‣ We obtained an accuracy of 89.76% with a Random Forest classifier in distinguishing spammers from legitimate users
‣ We classified the spammers into four broad categories
‣ We were able to detect users belonging to the Advertising, Self-Promotion and Abusive categories with accuracies of 88.23%, 87.23% and 78.88% respectively
Future Work
‣ Refine our methodology by using other classification algorithms
‣ Use multiclass classification to detect users in any of the spam categories
‣ Correlating the content and the URLs posted by different users can help identify several spam campaigns on Foursquare
Thank You!
Questions ?