PIF: A Personalized Fine-Grained Spam Filtering · PDF filePIF: A Personalized Fine-Grained...
Embed Size (px)
Transcript of PIF: A Personalized Fine-Grained Spam Filtering · PDF filePIF: A Personalized Fine-Grained...
IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, VOL. 2, NO. 3, SEPTEMBER 2015 41
PIF: A Personalized Fine-Grained Spam FilteringScheme With Privacy Preservation in Mobile
Social NetworksKuan Zhang, Student Member, IEEE, Xiaohui Liang, Member, IEEE, Rongxing Lu, Senior Member, IEEE,
and Xuemin Shen, Fellow, IEEE
AbstractMobile social network (MSN) emerges as a promisingsocial network paradigm that enables mobile users informationsharing in the proximity and facilitates their cyberphysicalsocialinteractions. As the advertisements, rumors, and spams spreadin MSNs, it is necessary to filter spams before they arrive atthe recipients to make the MSN energy efficient. To this end, wepropose a personalized fine-grained filtering scheme (PIF) withprivacy preservation in MSNs. Specifically, we first develop asocial-assisted filter distribution scheme, where the filter creatorssend filters to their social friends (i.e., filter holders). These fil-ter holders store filters and decide to block spams or relay thedesired packets through coarse-grained and fine-grained keywordfiltering schemes. Meanwhile, the developed cryptographic filter-ing schemes protect creators private information (i.e., keyword)embedded in the filters from directly disclosing to other users. Inaddition, we establish a Merkle Hash tree to store filters as leafnodes where filter creators can check if the distributed filters needto be updated by retrieving the value of root node. It is demon-strated that the PIF can protect users private keywords includedin the filter from disclosure to others and detect forged filters. Wealso conduct the trace-driven simulations to show that the PIF cannot only filter spams efficiently but also achieve high delivery ratioand low latency with acceptable resource consumption.
Index TermsFine-grained, mobile social network (MSN),personalized, privacy preservation, spam filter.
I. INTRODUCTION
M OBILE social network (MSN) has become a promisingsocial networking platform that enables group chat,social gaming, media sharing, and ubiquitous interactionamong nearby users [1]. An MSN can easily be established bysmartphone users in a local area. These users connect to eachother through Bluetooth, WiFi, and device-to-device commu-nications, and form an opportunistic network for a long spanof years or a temporary period (e.g., several hours). For exam-ple, MSNs create rich interaction opportunities for residents in
Manuscript received June 15, 2015; revised January 12, 2016; acceptedJanuary 14, 2016. Date of current version February 24, 2016. This work wassupported by a Research Grant from the Natural Science and EngineeringResearch Council (NSERC), Canada.
K. Zhang and X. Shen are with the Department of Electrical and ComputerEngineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada (e-mail:[email protected]; [email protected]).
X. Liang is with the Department of Computer Science, Universityof Massachusetts at Boston, Boston, MA 02125 USA (e-mail:[email protected]).
R. Lu is with the School of Electrical and Electronics Engineering, NanyangTechnological University, 639798 Singapore (e-mail: [email protected]).
Digital Object Identifier 10.1109/TCSS.2016.2519819
an urban neighborhood, students in a campus area, business-men in a conference, tourists visiting a museum or scenic site,and customers in a shopping mall. In MSNs, users interactionsare enabled anytime and anywhere without any concern of theInternet access and wireless data charge. According to a recentreport from comScore, Instagram users in the United Statesspend 98% of time with their mobile devices instead of desktop,while this percentage for Twitter users is over 86%. As we canimagine, users will have rich and quality service experiencesfrom MSN [2], since it helps users to obtain the desired infor-mation from others (e.g., crowdsourcing) [3] rapidly, efficiently,and pervasively.
MSN users receive various types of information, such asnewsletters, personal posts, rumors, and advertisements, mostof which are of great value to users. For example, in Fig. 1,local stores or restaurants repeatedly disseminate their serviceinformation, flyers, and advertisements to the nearby users. Asaving mom may like to have coupons, baby stuffs, and grocerysale information, while a tourist is interested in tour instructionsand handicrafts. On the other hand, users interests may changeover time. Although users could quickly exchange useful infor-mation in MSNs, they may still receive a portion of the uselessinformation, which is considered as spams [4]. However, thecommunications among MSN users mainly relies on userssmartphones, and happens with their opportunistic contacts.The communication overhead is much expensive due to the lim-ited battery of smartphones and opportunistic contacts amongusers. Therefore, it is crucial to make the communication mean-ingful in MSNs, i.e., transmit desired information to users andfilter spams as early as possible.
According to an investigation by Nexgate, spams over socialmedia have increased about 355% during only the first half of2013, while they are rapidly spreading in social networks suchthat every 1 of 200 social media posts is recognized as spam.Extensive research and industry efforts have been put on spamfiltering in various applications. Several schemes rely on black-list [5] or whitelist to either block spammers or admit legitimatesenders. An alternative way of filtering is to check the contentby matching the keyword associated with the packet [6], [7] orusing machine learning techniques [8] to detect spams. Socialgraph and relevant characteristics are also investigated for spamfiltering [9], [10]. Most of these schemes are performed by acentralized server or trusted authority, and require historicalinformation to detect spams. When spammers are shifting toMSNs, they have more chances of going undetected [11], since
2329-924X 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
42 IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, VOL. 2, NO. 3, SEPTEMBER 2015
Fig. 1. Information dissemination in MSNs.
MSNs have no centralized and trusted servers and lack histori-cal information. To this end, we propose a distributed filteringscheme where MSN users (i.e., filter creators) to personalizetheir spam filters, send them to others (i.e., filter holders), andallow filter holders to filter spams as early as possible. However,there are still many challenging issues for this type of filteringscheme in MSNs. The first challenge is how to distribute filterswith the consideration of both distribution costs and filteringaccuracy. Second, security and privacy concerns are raised,since the distributed filters may contain some sensitive informa-tion of filter creators. If the sensitive information within filters isdirectly exposed to others, it may violate the filter creators pri-vacy, such as health condition, lifestyles, and preferences [12].The third one is how to resist malicious attackers, since the orig-inal filters may be forged to block some useful information.In addition, if a user greedily distributes his filters to all theother users in the network, it may also consume many networkbandwidth and resources of others although it can benefit thisindividual user. These challenging issues motivate us to furtherimprove the filtering accuracy and preserve users privacy at thesame time.
In this paper, we extend our previous conference version onspam filtering [7] and propose a personalized fine-grained spamfiltering scheme (PIF) with privacy preservation in MSNs. ThePIF exploits personalized filters with social-assisted filter dis-tribution, privacy-preserving coarse-grained and fine-grainedfiltering, and efficient filter update. Specifically, the new con-tributions of this paper are threefold.
1) We develop a personalized filtering scheme with privacypreservation. The filter creator defines his filters in bothcoarse-grained and fine-grained manners. The keywordembedded in the coarse-grained filter enables filter hold-ers to forward the packets including the same keywordto the filter creator. The PIF also leverages a variant ofhidden vector encryption to achieve efficient fine-grainedfiltering. Both schemes prevent keywords in the filtersfrom directly disclosing to others.
2) We investigate MSN users social relationship andmobility in MSNs. Then, we exploit the opportunisticcontacts among users to analyze the packet delivery pro-cess in MSNs. Based on the analysis, we propose a
social-assisted filter distribution scheme that enables thefilter creator to send filters to his social friends who havehigh probability to meet him. By doing so, the PIF canreduce the filter distribution overhead and maintain thefiltering accuracy.
3) We conduct extensive simulations to show that the PIFcan significantly reduce the storage and communicationcosts and deliver the useful packets in a low delay.Meanwhile, the security property analysis demonstratesthat the PIF protect users private keyword from directlydisclosing to inside curious attackers and detect forgedfilters.
The remainder of this paper is organized as follows. InSection II, we review the related works on spam filtering.Then, we present the network and security models with designgoals in Section III. We propose the detailed PIF scheme inSection IV, followed by the security property analysis and thesimulations in Sections V and VI, respectively. Finally, weconclude this paper in Section VII.
II. RELATED WORK
Spam filtering has attracted numerous attentions and beenwidely investigated recently [13][15]. Intuitively, some tradi-tional filtering schemes exploit blackli