
Identifying Compromised Accounts on Social Media Using Statistical Text Analysis

Dominic Seyler
[email protected]
University of Illinois at Urbana-Champaign

Lunan Li
[email protected]
University of Illinois at Urbana-Champaign

ChengXiang Zhai
[email protected]
University of Illinois at Urbana-Champaign

ABSTRACT
Compromised accounts on social networks are regular user accounts that have been taken over by an entity with malicious intent. Since the adversary exploits the already established trust of a compromised account, it is crucial to detect these accounts to limit the damage they can cause. We propose a novel general framework for discovering compromised accounts by semantic analysis of the text messages originating from an account. Our framework is built on the observation that normal users will use language that is measurably different from the language that an adversary would use when the account is compromised. We use our framework to develop specific algorithms that use the difference of language models of users and adversaries as features in a supervised learning setup. Evaluation results show that the proposed framework is effective for discovering compromised accounts on social networks and that a KL-divergence-based language model feature works best.

KEYWORDS
compromised accounts, anomaly detection, text mining

1 INTRODUCTION
The web has significantly changed the way people interact with one another. With the advent of social media, the web has given humans a way to directly connect to billions of others. As with any new technology, there are great opportunities and benefits, but also dangers and risks. One problem that has become more prominent over the last years is the spread of misinformation in social media [1, 10, 22, 30]. In order to spread misinformation successfully, perpetrators mainly rely on social/spam bots [6, 8, 17] or compromised accounts [2, 25, 33]. In this work we address the latter problem of detecting compromised accounts in social networks.

Account compromising has become a major issue for public entities and regular users alike. In 2013, over a quarter million accounts on Twitter were compromised [12]. Shockingly, Time magazine reported that Russian hackers were able to compromise a large number of Twitter accounts for the purpose of spreading misinformation in the 2016 U.S. Presidential Election [2]. Despite massive efforts to contain account hacking, it is still an issue today [4, 18, 21]. These incidents are prime examples of the severity of damage that an attacker can cause when their messages are backed by the trust that users put into the entity whose account was compromised. For regular users a compromised account can be an embarrassing experience, since questionable posts are often published under their identity on social media. As a result, 21% of users that fall victim to an account hack abandon the social media platform [25].

Compromised accounts are legitimate accounts that a malicious entity takes control over, with the intention of gaining financial profit [25] or spreading misinformation [7]. These accounts are especially interesting for attackers, as they can exploit the trust network that the user has established [20]. For example, this exploitation makes it more likely that users within the personal network of the compromised account click on spam links, since they erroneously believe that their source is legitimate. Finding these hijacked accounts is challenging, since they exhibit traits similar to regular accounts [20]. For example, the user's personal information, friends, likes, etc. are all legitimate at the time of the account takeover. Researchers found that the detection of an account takeover can take up to five days, with 60% of the takeovers lasting an entire day [25]. A time window this large gives attackers ample time to reach their goal. Furthermore, blacklist matching cannot be used for the compromised account detection task. Patterns that expose an account as compromised can only be identified after analyzing changes in the account's behavior. This task therefore requires methodology that is specific to accounts that have been compromised.

In this work, we focus on developing an approach that can discriminate compromised from benign accounts by utilizing semantic text analysis. When it comes to textual information, current methods either do not leverage it at all [7, 20] or the textual features are simplistic. For instance, text is only used to detect language changes (e.g., from English to French) [7], or topics are derived plainly from hashtags [7, 27]. It is also common to simply use bag-of-words features [27] or document embeddings [11]. Thus, prior methods fall short of a deeper semantic analysis of the textual content. Our method follows current approaches on detecting anomalies within user profiles; however, we detect anomalies within the semantics of text as we leverage statistical language representations.

We study the detection of compromised accounts using semantic analysis of text and propose a novel general semantic analysis framework based on the observation that a regular user's textual output will differ from an attacker's textual output. As a specific implementation of the framework, for a user account under consideration, we model the user's and attacker's language as two smoothed multinomial probability distributions that are estimated using the textual output of the user and attacker, respectively. We use a similarity measure between the probability distributions as an indicator that an account is being compromised. Even though we do not know the beginning and end of an account takeover, our method leverages the fact that the average difference for randomly chosen begin and end dates will be higher for compromised accounts compared to benign accounts. Using features derived from the similarity measure, we train a classifier that can reliably detect compromised accounts.



We further formalize this problem in a threat model and utilize this model to simulate account takeovers in our dataset, to compare an implementation of the proposed framework¹ to other approaches. In our experiments we find that the proposed method outperforms general text classification models and two state-of-the-art approaches for compromised account detection. To show real-world applicability, we find evidence that non-artificially compromised accounts can be detected when the algorithm is trained on simulated data.

Outline. The rest of our paper unfolds as follows: We start by discussing related work in Section 2. In Section 3 we define our threat model and formalize the problem. In Section 4 we introduce our approach in detail by presenting our general framework, followed by a concrete instantiation using language models. In Section 5 we lay the groundwork for our experiments by stating our research questions and how we answer them. In Section 6 we demonstrate the viability of our approach and discuss our results in terms of our research questions. We conclude and discuss future work in Section 7.

2 RELATED WORK
Attackers gain control of regular user accounts either directly, by finding user name and password information, or through an application by using a valid OAuth token [25]. In the case where compromised accounts are exploited for financial gain, Thomas et al. [25] found three major schemes, namely: (1) the sale of products, e.g., weight loss supplements; (2) the sale of influence in the social network, e.g., a compromised account would advertise the sale of followers and simultaneously be used to generate these "fake" followers; and (3) lead generation, where users were tricked into filling out surveys.

It is common practice in compromised account detection to build profiles based on user behaviour and look for anomalies within them. Egele et al. [7] learn behavioral profiles of users and look for statistical anomalies in features based on temporal, source, text, topic, user, and URL information. Their textual features do not consider semantics, since they only detect the change in language (e.g., from English to French) of messages. Also, their topic features are somewhat simplistic, since they only consider message labels, such as hashtags, as topics. Ruan et al. [20] perform a study of behavioral patterns using click-stream data² on Facebook. From the observations made in the behavioral study, the authors derive features that model public (e.g., posting a photo) and private (e.g., searching for user profiles) behavior of users. To find compromised accounts the authors look at different click-streams associated with a single user and calculate feature variance. An account is considered compromised if the difference of a new click-stream does not fall within the variance estimated on the user's click-stream history. Similar to our work, the authors use data of other users to simulate compromised accounts. However, their method seems to be tailored to social media platforms that are similar to Facebook, whereas our method is general enough that it can be applied to any platform that uses natural language text.

¹The code is available at: https://github.com/dom-s/comp-account-detect.
²In general, it is hard to obtain such click data, since it is usually only accessible to the social media providers.

Viswanath et al. [28] use principal component analysis (PCA) to characterize normal user behavior, which allows them to distinguish it from abnormal behavior. Their features do not consider text, and it is noticeable that, compared to other detection categories, their approach performed worst on compromised account detection. VanDam et al. [27] perform a study and propose features for the detection of compromised accounts on Twitter. The study manually categorizes the content of compromised accounts and measures certain account characteristics, such as the number of hashtags or number of mentions in tweets. The authors show that these characteristics can be used as features in a classification framework, improving F1 by 9 percentage points when added to term-based features in a logistic regression classifier. Terms alone contribute 32 percentage points in F1, which indicates that text-based features are the most important. Karimi et al. [11] have shown that Long Short-Term Memory (LSTM) networks can be applied to capture the temporal dependencies in the compromised account detection problem. The work leverages context (i.e., textual) data using document embeddings, as well as network features derived from the neighboring users in the social graph. The LSTM is leveraged to learn distinguishing temporal abnormalities in the user account. Adding the textual features to the model seems to bring the highest performance improvement. However, the experiments do not show the effectiveness of their method when it is restricted to textual information only.

3 THREAT MODEL
The adversary's goal is to inject textual output into a benign account, in order to mask the output's origin and leverage the user's influence network. In our setting, it is irrelevant how the adversary obtained access to the account. To make our method as general as possible, we assume that no account details are observed, meaning that we ignore the friendship network, profile details, etc. To further cater to generality, we do not make any assumptions about the text, except that it was written by a different author. In the case of an attack, we assume that at least one message is injected by the adversary and that at least one benign message is observed. For all accounts we assume that they have existed for more than one day.

We define our threat model formally and frame it as a classification problem. Let U be the set of all users u. Further, let m^u_t be a message m from user u at time t ∈ {1, 2, ..., N}. Our goal is to find all compromised user accounts U_comp ⊂ U. We turn the compromised account detection task into a binary classification problem, where the goal is to decide whether a user account is compromised or not. We therefore learn a function comp(u), which returns 1 if u ∈ U_comp and 0 otherwise.

4 APPROACH

4.1 General Framework
We now describe the proposed framework for identifying compromised accounts. In accordance with our threat model, the framework is based on the assumption that an adversary's textual output will deviate from a regular user's textual output. To capture this discrepancy in language usage, we propose to divide the tweet space of a user into two non-overlapping sets M_User and M_Attack.


Figure 1: A tweet stream divided according to the begin (t_begin) and end (t_end) of an attack. A language model is learned for regular tweets (θ_User) and compromised tweets (θ_Attack).

Figure 2: Overview of approach.

We randomly assign two timepoints: t_start signals the start of the attack and t_end signals its end. All m^u_t with t ∈ [t_start, t_end] make up M_Attack, whereas all m^u_t with t ∈ [t_1, t_start−1] ∪ [t_end+1, t_N] make up M_User. We can measure the difference between M_User and M_Attack using any similarity measure of our choice. This procedure can be repeated multiple times for different values of t_start and t_end. The time index t can be of different granularity, where the minimum is per-message and the maximum can be chosen at will. The more often the procedure is repeated for a certain user, the higher the sampling rate, which makes for better approximations of the true difference between M_User and M_Attack (we study the optimal sample rate in Section 6). Thus, this strategy provides a flexible trade-off between accuracy and efficiency. The similarity measures can then be employed as features in the downstream task.
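To make the sampling step concrete, the following Python sketch (our own illustration, not the authors' released code) splits a chronologically ordered list of messages into M_Attack and M_User for randomly drawn attack windows; the per-message granularity, helper names, and parameter defaults are assumptions.

```python
import random

def split_messages(messages, t_start, t_end):
    """Messages inside the attack window [t_start, t_end] form M_Attack,
    the remaining messages form M_User."""
    m_attack = messages[t_start:t_end + 1]
    m_user = messages[:t_start] + messages[t_end + 1:]
    return m_user, m_attack

def sample_splits(messages, n_samples=50, seed=0):
    """Draw random (t_start, t_end) pairs at per-message granularity and yield the
    corresponding (M_User, M_Attack) splits; requires at least three messages so that
    at least one benign message remains on either side of the attack window."""
    rng = random.Random(seed)
    n = len(messages)
    for _ in range(n_samples):
        t_start = rng.randrange(1, n - 1)       # leave at least one user message before
        t_end = rng.randrange(t_start, n - 1)   # leave at least one user message after
        yield split_messages(messages, t_start, t_end)
```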

4.2 Instantiation with Language Modeling
We describe our practical instantiation of the framework using language modeling and supervised learning. We create a classifier that distinguishes compromised from benign user accounts based on a feature set derived from our framework. We measure the discrepancy as the difference between two word probability distributions, one for the user and one for the attacker. We select KL-divergence [14] as our method of choice to compare the difference of probability distributions. We assume that when a user writes a message she draws words from a probability distribution that is significantly different from the distribution an attacker draws words from. Let θ_User and θ_Attack be two word probability distributions (i.e., language models) for the user and the attacker. We need to select two time points t_start and t_end that mark the beginning and end of the attack (Figure 1). As described by our framework, all messages m^u_t that fall within the time interval [t_start, t_end] contribute to θ_Attack, whereas messages with t ∉ [t_start, t_end] contribute to θ_User.

Figure 2 presents an overview of our approach. For a particular user, we sample her tweet stream for random (t_start, t_end) pairs. As seen in the figure, the algorithm can, for example, select t_3 and t_{N−2} as t_start and t_end, respectively. Thus, all tweets that fall within {[m^u_1, m^u_2], [m^u_{N−1}, m^u_N]} contribute to θ_User. All tweets that fall in [m^u_3, m^u_{N−2}] contribute to θ_Attack. Then, the KL-divergence for these specific θ_User and θ_Attack contributes one sample for this user (D^{50}_{KL} in the figure). This process is repeated for different samples (e.g., we used 50 samples in our experiments), where each time t_start and t_end are selected at random, with the constraint t_start < t_end. Sample rates are selected empirically, depending on the best classification performance on a development set. We select the maximum, minimum, mean and variance of the sampled KL-divergence scores. These features are then combined in a Support Vector Machine (SVM) model, which learns the optimal weighting for each feature based on the training data. Naturally, other classifiers can also be used, but exploration of different classifiers is out of the scope of this paper.
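A minimal sketch of this feature construction, reusing the sample_splits helper sketched in Section 4.1 and a kl_divergence function such as the one sketched in Section 4.3; the pipeline wiring below is our assumption, not the authors' released implementation.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def kl_features(messages, kl_divergence, n_samples=50):
    """Aggregate the sampled KL-divergences D_KL(theta_Attack, theta_User)
    into the four per-account features used by the classifier."""
    samples = np.array([kl_divergence(m_attack, m_user)
                        for m_user, m_attack in sample_splits(messages, n_samples)])
    return [samples.max(), samples.min(), samples.mean(), samples.var()]

# X: one feature row per account, y: 1 = compromised, 0 = benign
# clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
# clf.fit(X_train, y_train)
```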

4.3 Language Modeling Details
We try to estimate the joint probability P(w_1, w_2, w_3, ..., w_N) for all words w_1, ..., w_N in the text. According to the chain rule, this is equivalent to computing

$$P(w_1)\,P(w_2 \mid w_1)\,P(w_3 \mid w_1, w_2) \cdots P(w_N \mid w_1, \dots, w_{N-1}).$$

Because of the combinatorial explosion of word sequences and the extensive amount of data needed to estimate such a model, it is common to use n-gram language models. N-gram language models are based on the Markov assumption that a word w_k only depends on the n previous words, i.e., P(w_k | w_1, ..., w_{k−1}) ≈ P(w_k | w_{k−n+1}, ..., w_{k−1}). The simplest and computationally least expensive case is the uni-gram. Here, the Markov assumption is that a word w_k is independent of any previous word, i.e., P(w_k | w_1, ..., w_{k−1}) ≈ P(w_k). All P(w_k) for k ∈ {1, 2, ..., N} make up a language model θ, which is a multinomial probability distribution, where words in the document are events in the probability space. The parameters of the language models θ are estimated using maximum likelihood. Maximizing the likelihood of the uni-gram language model is equivalent to counting the number of occurrences of w_k and dividing by the total word count (P(w_k) = c(w_k)/N, where c(w_k) is the count of word w_k).

The distance between two probability distributions can be estimated using the Kullback-Leibler divergence [14]. In the discrete case, for two multinomial probability distributions P and Q the KL-divergence is given as

$$D_{KL}(P, Q) = \sum_i P(i) \log\frac{P(i)}{Q(i)}.$$

It can be observed that D_KL(P, Q) ≠ D_KL(Q, P). However, it is common practice to still think of D_KL as a distance measure between two probability distributions [34]. An issue with D_KL is that the sum runs over i, which is the event space of P and Q. Thus, it requires the event space to be equivalent for both distributions.


In our case, θ is a language model. Now, let v(θ) denote the vocabulary set of θ. As a result of maximum-likelihood estimation, in most cases v(θ_User) ≠ v(θ_Attack). Thus, we have to smooth the probability distributions such that v(θ_User) = v(θ_Attack), which is required in order to calculate D_KL. To achieve this, we define v(θ) := v(θ_User) ∪ v(θ_Attack). Then, we set v(θ_User) = v(θ) and v(θ_Attack) = v(θ) and estimate θ_User and θ_Attack using the Laplace estimate.
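A sketch of this estimation, assuming whitespace tokenization and add-one (Laplace) smoothing over the union vocabulary; it is our illustration of the described procedure rather than the published code.

```python
import math
from collections import Counter

def kl_divergence(attack_msgs, user_msgs):
    """D_KL(theta_Attack, theta_User) between two Laplace-smoothed uni-gram
    language models estimated over the shared vocabulary v(theta)."""
    attack_counts = Counter(w for m in attack_msgs for w in m.split())
    user_counts = Counter(w for m in user_msgs for w in m.split())
    vocab = set(attack_counts) | set(user_counts)   # v(theta_User) ∪ v(theta_Attack)

    def laplace(counts):
        total = sum(counts.values()) + len(vocab)   # add-one smoothing denominator
        return {w: (counts[w] + 1) / total for w in vocab}

    p, q = laplace(attack_counts), laplace(user_counts)
    return sum(p[w] * math.log(p[w] / q[w]) for w in vocab)
```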

5 EXPERIMENTAL DESIGN
This section discusses the research questions that we answer experimentally. We first perform a feasibility analysis to find evidence that (I) compromised user accounts do exhibit a higher difference in language models compared to benign accounts. To show this we calculate the KL-divergence for all possible combinations of t_begin and t_end for a subset of users and manually investigate the characteristics of these accounts. The second part of our feasibility analysis finds evidence that (II) the average KL-divergence can be estimated by randomly sampling a certain number of points with different begin/end dates. For this we leverage the same data as in our first feasibility study and plot, for different sample rates, the estimated average KL-divergence and mean squared error, compared to the actual averages. Following the feasibility study, we show (III) how effective the proposed language model based method is in detecting compromised accounts in a simulated environment; (IV) how the proposed features compare to general text classification features and how to combine them; (V) that our proposed model outperforms state-of-the-art compromised account detection methods and can further improve performance by combining their feature spaces; and (VI) how effective our method is on a real (non-simulated) dataset, by performing a qualitative, manual investigation. We start by introducing our dataset.

5.1 Dataset
Previous methods on compromised account detection either do not publish their datasets [7, 27, 28] or the original text data is not fully recoverable due to restrictions of the social media platform or the deletion of user data [11]. Therefore, meaningful evaluation is especially challenging in our setting. To circumvent this problem it has become common that account attacks are simulated [9, 20, 26]. In similar fashion, we perform a simulation of account attacks, which allows us to compare different methods quantitatively to study their effectiveness. Any bias introduced by the simulation is unlikely to affect our conclusions as it is orthogonal to the methods that we study, and we argue that such an evaluation strategy is adequate to perform a fair comparison of different methods.

For simulation, we leverage a large Twitter corpus from Yang and Leskovec [32]. The dataset contains continuous tweet streams of users. Relationships between users, etc. are unknown. Finding compromised accounts manually within a dataset of 20 million users is clearly infeasible. Simulating account hijackings enables us to (cheaply) create a gold standard while having full control over the number of compromised accounts and the begin/end dates of account takeovers. For simulation, we follow our threat model introduced in Section 3, where the adversary's goal is to inject messages into regular accounts. Since we make no assumptions about the textual content, we decided to replace part of the consecutive tweet stream of an account with the same amount of consecutive tweets from another random user account. Since our algorithm has no knowledge about the begin and end of an attack, we choose these at random, while ensuring that only a certain pre-defined fraction of the tweets is compromised. This allows us to test the effectiveness of our method in different scenarios, where different percentages of tweets are compromised within an account. To get meaningful estimates for the language models θ_User and θ_Attack, we select the 100,000 users with the most tweets in our data. According to our threat model, we retain users with more than one day of coverage, resulting in 99,912 users and 129,442,760 tweets.
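The following sketch illustrates one way such a takeover could be simulated, under the assumption that each account is a chronologically ordered list of tweets; the function name and the exact placement logic are ours, not the published simulation code.

```python
import random

def simulate_takeover(victim_tweets, donor_tweets, fraction=0.25, seed=0):
    """Replace a consecutive block of the victim's tweets with the same number of
    consecutive tweets taken from another (randomly chosen) donor account."""
    rng = random.Random(seed)
    n_inject = max(1, int(len(victim_tweets) * fraction))
    n_inject = min(n_inject, len(donor_tweets), len(victim_tweets) - 1)
    start = rng.randrange(0, len(victim_tweets) - n_inject)          # random attack begin
    donor_start = rng.randrange(0, len(donor_tweets) - n_inject + 1)
    injected = donor_tweets[donor_start:donor_start + n_inject]
    return victim_tweets[:start] + injected + victim_tweets[start + n_inject:]
```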

5.2 Feasibility Studies
In our feasibility studies we use a subset of the data by selecting 495 users at random, where each compromised account contains 50% compromised tweets. For each user, we put all tweets into daily buckets and calculate the KL-divergence for all possible combinations of t_begin and t_end. The reason this cannot be done for the whole dataset is simply the computational cost. We therefore operate on this subset to investigate whether our method is feasible and can be approximated to avoid increased computational effort.

5.3 Quantitative Evaluation
Our quantitative evaluation aims to show the performance of the algorithm using standard metrics. We measure classification accuracy, as well as precision, recall and F1-score for the class representing the predictions of compromised accounts (1-labels). In addition, we show the effectiveness of our method for different levels of difficulty. We experiment with various settings for the percentage of compromised tweets in compromised accounts, with random begin and end dates of the account takeover. More concretely, we select 50%, 25%, 10% and 5% of tweets to be compromised, each representing a more difficult scenario. As these ratios would not be strictly separated in a real-world scenario, we further experiment with random ("RND") ratios, drawn uniformly from [5%, 50%]. In our experiments the probability of an account being compromised is set to 0.5 to obtain a balanced dataset. Furthermore, we employ ten-fold cross validation.

5.4 Baselines
We choose to compare our proposed features, based on language modeling (LM), to general features that have been proposed for text classification (i.e., word-based and Doc2Vec [15]) and to task-specific features from two state-of-the-art methods in compromised account detection, namely COMPA [7] and VanDam [27]. We combine them in a Support Vector Machine (SVM) framework, which has been shown to be an excellent predictor for text classification. However, our approach is general and can be applied to any classifier.
SVM. We use SVM [5] with linear kernels for all models. Features are standardized by removing the mean and scaling to unit variance. Ten-fold cross validation is performed for all models to ensure there is minimal bias in selecting the training/testing split of the data. All posts of a single user are concatenated as one document. The labels for each document were chosen according to our problem definition in Section 3: 1 if an account is compromised and 0 otherwise.


Word-based Features. For word-based feature representation and feature selection we follow the methodology of Wang et al. [29]. We choose count-based (COUNT) and TF*IDF [24] based (TF*IDF) representations of words and experiment with two supervised feature selection strategies, namely chi-square and mutual information. For dictionary creation we keep 100,000 uni-grams, which appear in no fewer than 20 and no more than 9,991 (=10%) of documents, to remove very rare and very common words.
Document-based Features (Doc2Vec). We leverage the method proposed in Le and Mikolov [15] to learn low-dimensional vector representations for documents. As recommended, we choose a vocabulary size of 2M tokens [19], 400-dimensional document vectors and a window size of 5 [15]. We make use of the document vectors as features in our SVM classification framework.
COMPA [7]. This method builds behavioral user profiles and detects compromised accounts by finding statistical anomalies in the features. We implement the features that can be derived from our dataset: time (hour of day), message text (language), message topic, links in messages and direct user interaction. This method is designed for stream processing; once a user profile is built, the method checks each new message against the existing user profile. For each feature the message is assigned an anomaly score, which reflects how different this message is from the previously observed user profile. Since our method classifies accounts as a whole, we simulate this process for each message within an account (in temporal order of the posting of the messages). We then aggregate the anomaly scores, per feature, as the mean of the scores of all the account's messages.
VanDam [27]. This work investigated the distributional properties of different features on compromised accounts and uses them in a classification framework. We implement the following features from this method: hashtags, mentions, URLs, retweets and sentiment. The authors use these features to classify single messages as compromised. In our setting, we aggregate each feature's counts over all messages within an account and use the mean of the counts as features.
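As an illustration of the word-based baseline, a hedged scikit-learn sketch; the thresholds (min_df=20, max_df≈10%, 100,000 uni-grams) follow the description above, while the choice of k for chi-square selection and the overall pipeline wiring are our assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# One concatenated document per user; labels: 1 = compromised, 0 = benign.
word_baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 1), min_df=20, max_df=0.10, max_features=100_000),
    SelectKBest(chi2, k=10_000),    # chi-square feature selection (k is an assumption)
    SVC(kernel="linear"),
)
# word_baseline.fit(train_docs, train_labels)
```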

5.5 Qualitative Evaluation
Our qualitative evaluation aims to show how well compromised accounts can be detected in the original dataset. Therefore, we manually evaluate the accounts that have the highest probability according to our classifier. Here, a major change is that we only inject messages into the training dataset, whereas the testing set is left untouched. We randomly select 70% of users for training and 30% of users for testing, with 25% of tweets compromised. Since there is no ground truth in the testing dataset, we define four specific criteria to evaluate whether an account is considered compromised. If two or more of the following criteria are met, an account is considered as being compromised: in certain time periods the account (1) has a sharp topic change, (2) has a specific posting frequency change, (3) has a specific language change, or (4) posts repeated tweets.

6 EXPERIMENT RESULTS

6.1 Language Model Similarity of Compromised Accounts
In the first part of our feasibility analysis we answer research question (I) and examine whether any significant differences in KL-divergence between benign and compromised user accounts exist.

To achieve this, we manually inspect the differences in user accounts by leveraging a heatmap as a visual cue. Figure 3 depicts four heatmaps of two benign (Figures 3a, 3b) and two compromised user accounts (Figures 3c, 3d). The x-axis of each figure depicts different values for t_begin, while the y-axis depicts different values for t_end. The color palette ranges from blue (low KL-divergence) to red (high KL-divergence). The part of the plot below the diagonal is intentionally left empty, since these values would represent nonsensical variable assignments for t_begin and t_end.

It is immediately evident that we find more high KL-divergence values for compromised compared to benign accounts. In the figures, high values are expressed by red and dark red colors. From this observation it can be inferred that the average KL-divergence will be higher for accounts that are compromised. By manually inspecting over 100 of these user account heatmaps we find that many of them follow this general trend. We understand this as preliminary evidence that a method which utilizes KL-divergence for detecting compromised accounts is feasible.

We further noticed in Figures 3c and 3d that the account takeover happened where the KL-divergence reaches its maximum (see the dark circle in Figure 3c and the tip of the pyramid in Figure 3d). Unfortunately, this is not true for all inspected accounts, but it gives reason to believe that there might be potential to find the most likely period of an account takeover with our current framework. This could be done by finding the t_begin and t_end that maximize the difference of the language models θ_User and θ_Attack.

6.2 Estimate KL-divergence Using Random Sampling
In our second feasibility analysis we investigate research question (II), whether the average KL-divergence of a user account can be approximated using random sampling. More specifically, we try to find a reasonable estimate by calculating the KL-divergences only for a subset of the (t_begin, t_end) pairs.

Since we have calculated the KL-divergence for every possible combination of t_begin and t_end for all of the 495 users, we know the actual average KL-divergence. We then try different sampling rates. For every sampling rate we average over the samples and compare them to the actual average that is calculated using all (t_begin, t_end) pairs. The result is shown in Figure 4. In the figure we plot the actual average KL-divergence for compromised and benign accounts against the averaged samples for different sample rates. The sample rates range from 1 to 121. The plot shows that the average KL-divergence for compromised accounts is about 0.1 higher than for benign accounts. This confirms our findings in Section 6.1. Furthermore, we see that for small sample rates (< 81) there are minimal deviations from the actual average (±0.01). The higher the sample rate, the lower these deviations become, as our estimate gets better.


Figure 3: KL-divergence heatmaps for different benign (Figures 3a and 3b) and compromised user accounts (Figures 3c and 3d). The x-axis of each figure depicts different values for t_begin, while the y-axis depicts different values for t_end. The color palette ranges from blue (low KL-divergence) to red (high KL-divergence).

Since our estimates only deviate slightly, we also investigate the mean squared error (mse) for each sampling rate. Here, the mse is defined as:

$$\text{mse} = \frac{1}{n} \sum_{u \in \mathcal{U}^{test}} \big(\text{sampled\_avg}(u) - \text{actual\_avg}(u)\big)^2$$

In Figure 5 we plot the mse for compromised and benign accounts. For very small sampling rates (< 50) we see errors of over 0.07 and over 0.06 for compromised and benign accounts, respectively. Once the sample rate is greater than 101, the mse is close to 0. We therefore conclude that a sampling rate in the range of [50, 100] is sufficient for our experiments.
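A small sketch of this feasibility check, assuming kl_all_per_user holds, for each of the 495 users, the exhaustively computed KL-divergences for all (t_begin, t_end) pairs; the names are hypothetical.

```python
import random
import numpy as np

def mse_of_sampled_average(kl_all_per_user, sample_rate, seed=0):
    """Mean squared error between the sampled average KL-divergence and the
    exhaustive average, over all users."""
    rng = random.Random(seed)
    errors = []
    for kl_values in kl_all_per_user:                      # one list of KL values per user
        sampled = rng.sample(kl_values, min(sample_rate, len(kl_values)))
        errors.append((np.mean(sampled) - np.mean(kl_values)) ** 2)
    return float(np.mean(errors))
```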

6.3 Effectiveness on Simulated Data
In this subsection we show (III) the effectiveness of our language model based method when detecting compromised accounts in a simulated environment, (IV) how the performance compares to other text classification features, and (V) that improvements can be made when combining the language model features with other specialized approaches.

To evaluate research question (III), we perform an ablation study in Table 1, which shows the performance of all features together and individually. In this experiment 50% of the tweets of a compromised account are compromised. We observe that we gain maximum performance over all metrics when all features are utilized (see column "All"). The classifier reaches an accuracy of 0.80, a high precision (0.90) and a recall of 0.68. Both measures are combined in the F1-score, which reaches 0.78. We would like to note that a precision of 0.90 can be sufficient for many practical applications, where the system could alert users or platform providers. When features are investigated individually, we find that the features "Max" and "Variance" perform worst.



Figure 4: Average KL-divergence for different sample rates in the interval [1, 121].


Figure 5: Mean squared error (mse) for different sample rates in the interval [1, 121].

Table 1: Ablation study using different measures.

Measure    All   Max   Min   Mean  Var.
Accuracy   0.80  0.59  0.76  0.75  0.48
F1         0.78  0.57  0.72  0.72  0.35
Precision  0.90  0.61  0.87  0.83  0.47
Recall     0.68  0.53  0.61  0.64  0.27

The features "Min" and "Mean" perform almost equally well in isolation and much better than the other two. We find the minimum of the sampled divergences between the two probability distributions to be a strong signal. This further confirms our earlier observation that compromised accounts can be distinguished by their higher average KL-divergence.

We turn to research question (IV) and investigate how our language model based approach compares to text classification features. Table 2 lists the results for our approach (LM) and Doc2Vec. The best performance is reached when 50% of tweets are compromised, with accuracy at 0.80 and 0.71, respectively. When the number of compromised tweets is reduced by half, the accuracy drops to 0.75 and 0.69. When the amount of compromised tweets is set to 10% and 5%, accuracy further decreases for both approaches. For random ratios, the performance is slightly lower than for the fixed 25% ratio. As expected, this performance falls between the lower (5%) and upper (50%) bounds. Summarizing, we see that LM achieves higher accuracy and precision, whereas Doc2Vec achieves higher recall.

In Table 3 we compare LM to general text classification features and their combinations. It can be seen that Doc2Vec performs better than the word-based models. This might be due to the fact that Doc2Vec is able to learn a better representation for each document, compared to TF*IDF and COUNT. Doc2Vec outperforms the TF*IDF model by up to 15 percentage points (≈27% relative improvement) when the amount of compromised tweets reaches 50%. The performance improvement is less drastic when only small amounts of tweets are compromised. We further find that LM performs best, as it outperforms TF*IDF and Doc2Vec by 6 and 7 percentage points, respectively, when only 5 percent of tweets are compromised. The most distinguishing performance is achieved when the amount of compromised tweets reaches 50%. There, LM outperforms Doc2Vec by 9 percentage points (≈13% relative improvement) and TF*IDF by 24 percentage points (≈43% relative improvement). We argue that this is strong evidence for the superiority of the LM approach compared to other text classification features.

When combining features, we find that adding Doc2Vec features to LM results in the highest performance improvements, with up to 7 percentage points over LM alone (column "LM + Doc2Vec"). Adding TF*IDF to LM or Doc2Vec has only a negligible effect (columns "LM + TF*IDF" and "Doc2Vec + TF*IDF"). The same minimal improvements are observed when adding TF*IDF to LM and Doc2Vec (column "all"). Thus, we conclude that the proposed LM features add meaningful signals to general text classification features.

6.4 Comparison with Existing Detection Methods
For answering research question (V), we show how LM outperforms state-of-the-art detection methods and that improvements can be made by combining LM with these methods. The baseline features here differ in that they are specifically designed for compromised account detection, whereas the previous features were general textual representations.

We compare all models in Table 4. The first three rows compare the compromised account detection features on their own. Our model outperforms the strongest baseline (COMPA) over all metrics, with the highest gains made in accuracy and precision, increasing by 19.4% and 26.6%, respectively. The fifth through seventh rows show the combination of the LM features with each of the two baselines alone and in combination. Here we find that our method can further improve when the baseline features are added.


Table 2: LM and Doc2Vec features, with different percentages of compromised tweets.

              LM                                   Doc2Vec
 %     Accuracy  F1    Precision  Recall    Accuracy  F1    Precision  Recall
 50    0.80      0.78  0.90       0.68      0.71      0.71  0.72       0.71
 25    0.75      0.71  0.82       0.63      0.69      0.69  0.70       0.69
 10    0.65      0.62  0.69       0.56      0.63      0.63  0.64       0.63
 5     0.59      0.56  0.60       0.52      0.59      0.59  0.59       0.59
 RND   0.75      0.70  0.81       0.61      0.68      0.68  0.69       0.68

Table 3: Accuracy for different features and their combinations. Our method (LM) is compared to general text representations.

 %     COUNT  TF*IDF  Doc2Vec  Doc2Vec+TF*IDF  LM    LM+TF*IDF  LM+Doc2Vec  all
 50    0.53   0.56    0.71     0.72            0.80  0.81       0.87        0.87
 25    0.53   0.55    0.69     0.69            0.75  0.75       0.82        0.82
 10    0.52   0.54    0.63     0.63            0.65  0.65       0.70        0.71
 5     0.52   0.53    0.59     0.59            0.59  0.59       0.62        0.62
 RND   0.53   0.55    0.68     0.68            0.74  0.74       0.80        0.80

We argue that our features are orthogonal to existing work, as evidenced by the performance improvement due to the combination of the feature spaces. The second-to-last row shows the performance when all baseline features are combined with the LM model. This results in the best performing model, with gains over all metrics.

6.5 Effectiveness on Real Data
The final question we answer experimentally is (VI), how effective our method is on a real (non-synthetic) dataset. In these experiments the classifier is trained on synthetic data and evaluated on "untouched" Twitter data. Using this setup we show that it is possible to train a detection algorithm with no human labeling effort required.

In our first step we categorize the k = 20 accounts with the highest probability of being compromised into six categories: {news, advertisement/spam, re-tweet bot, compromised, regular user, unknown}. The results can be found in Table 5. We find that most of these accounts belong to categories with high variation in language, i.e., news, advertisement/spam and re-tweet bot. One of the accounts is found to be compromised. The algorithm also misclassified seven regular user accounts as being compromised. One account was categorized as "unknown", since it was posting in a language that none of the evaluators speaks. It is reasonable to assume that if our algorithm were applied at a much larger scale to all Twitter users, it would most likely be able to detect many more compromised accounts. Furthermore, these results show that our algorithm can also be used to detect "unusual" accounts and users, thus potentially enabling the development of novel text mining algorithms for analyzing user behavior on social media, which should be a very interesting future research topic.

Next, we inspect the current state of each account and find a total of five different states (i.e., abandoned, active, deleted, tweets protected and suspended). Table 6 lists the results. We find that most accounts are abandoned. Noteworthy is also that one of the accounts was suspended by Twitter. The account that was identified as compromised was set to "tweets protected", which could be an indicator that said user had become more conscious about tweets that were posted from her account and therefore decided to not share her tweets publicly.

6.6 Compromised Account Profiling
We now further investigate the compromised account and discuss the details which made us believe that a compromise had happened. When manually investigating the account's tweets, we found that they mostly discussed celebrities, movies and music, with very few links in any messages. At some point in the tweet stream a near-duplicate message is posted hundreds of times with only brief pauses between tweets. The pattern that all of these tweets had in common is shown in Figure 6. Here, @{username} refers to different users, who were most likely followers of the account. With this scheme the hacker tries to directly grab the attention of a targeted user. The placeholder {link} was one of two links that were identified as suspicious by the link-shortening service the hacker utilized to hide the actual URL. Since both links no longer exist, we cannot provide additional details about the target of these links. After this "attack" of messages, the tweets return to a similar pattern as before.

From our standpoint we conclude that this account was compromised for the following reasons:

(1) The account was discussing different topics before the attack.
(2) The account was posting few links in messages.
(3) A near-duplicate message with suspicious intent and links was repeatedly posted hundreds of times at a certain point.
(4) These messages stop and the account continues with regular activity.

From the content of these messages we conclude that the hacker was pursuing a lead generation scheme [25], where users are lured into clicking a link that offers some sort of payment for their information.


Table 4: Comparison to features from related methods. The dataset has random percentages of tweets compromised (RND).

 Model                                       Accuracy  F1     Precision  Recall
 COMPA [7]                                   0.62      0.60   0.64       0.56
 VanDam [27]                                 0.50      0.47   0.50       0.45
 LM                                          0.74      0.70   0.81       0.61
 improvement LM over best baseline           19.4%     16.7%  26.6%      8.9%
 LM + COMPA                                  0.75      0.73   0.81       0.66
 LM + VanDam                                 0.74      0.71   0.82       0.62
 LM + COMPA + VanDam                         0.76      0.73   0.81       0.67
 improvement over LM                         2.7%      4.3%   1.2%       9.8%
 LM + Doc2Vec + TF*IDF + COMPA + VanDam      0.81      0.79   0.85       0.75
 improvement when adding standard features   6.6%      8.2%   4.9%       11.9%

Table 5: Category assignment of manually evaluated accounts.

 Category            Count
 News                5
 Advertisement/Spam  4
 Re-tweet Bot        2
 Compromised         1
 Regular User        7
 Unknown             1

Table 6: Account status of manually evaluated accounts.

 Account Status    Count
 Abandoned         7
 Active            6
 Deleted           4
 Tweets protected  2
 Suspended         1

Pattern: @{username} as discuss [sic] earlier heres [sic] the link, another service that will pay $ for your tweets, anyone can join {link}
Profit scheme: Lead generation
Current status: Tweets protected

Figure 6: Profiling of compromised account.

6.7 Error Analysis
From Table 5, we can see that five news accounts, four advertisement/spam accounts and two re-tweet bots are misclassified as compromised accounts. We encounter this problem because our method uses a similarity function, in this case the KL-divergence, as the measurement to determine whether an account is being compromised. One common characteristic of news accounts and advertisement accounts is that they rarely use the same words or sentence patterns, because they discuss different news and products. Thus, the high language diversity in these accounts causes the KL-divergence between any two time periods to be high, which leads to the misclassification.

A potential way to improve the method would be to order the samples chronologically instead of treating them as a set. An algorithm could then look for extreme changes in KL-divergence, which should mostly occur in compromised accounts and less in news or spam accounts, where the KL-divergence is constantly high.
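A hedged sketch of this idea: order the KL samples by the midpoint of their attack window and report the largest jump between neighboring samples. This refinement is not part of the evaluated method, and the representation of a sample as a (t_begin, t_end, kl) tuple is our assumption.

```python
def max_chronological_jump(kl_samples):
    """kl_samples: list of (t_begin, t_end, kl) tuples for one account.
    Returns the largest absolute change in KL-divergence between samples
    ordered by the midpoint of their attack window."""
    ordered = sorted(kl_samples, key=lambda s: (s[0] + s[1]) / 2)
    kls = [kl for _, _, kl in ordered]
    if len(kls) < 2:
        return 0.0
    return max(abs(b - a) for a, b in zip(kls, kls[1:]))
```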

The regular user accounts do not exhibit any obvious pattern for the misclassification. We found that some of these misclassified accounts do exhibit one or more of the following characteristics that could potentially influence the KL-divergence samples:

(1) excessive usage of re-tweets,
(2) many messages with distinct links,
(3) many repeated messages,
(4) many misspellings of the same word.

Further exploration of this is another very interesting future research direction.

7 CONCLUSION AND FUTURE WORK
While humans benefit from the web in many ways, there are also significant risks associated with it. We addressed one of the prominent problems in social networks, a major part of the web connected to many people's lives. We proposed a novel general framework based on semantic text analysis for detecting compromised social media accounts. Following the framework, we proposed a specific instantiation based on uni-gram language models and the KL-divergence measure, and designed features accordingly for use in a classifier that can distinguish compromised from benign accounts. We conclude that:

(1) The proposed LM feature is the most effective, even when used as a single feature-based detection method, outperforming all baseline methods.
(2) LM captures new signals that have not been captured by the existing methods and features, which is shown by the further improvement when it is added on top of the baselines.
(3) The best performing method combines the proposed LM with all the existing features.
(4) We find evidence that a detection algorithm can be trained without any human labeling effort when the account hacking is simulated.


Although LM is motivated by a security problem, our general idea of performing differential semantic analysis of text data may be applicable to other domains where incohesion (or outliers) in text data needs to be captured. Further exploration of such potential directions is interesting future work.

Our work has demonstrated the feasibility of leveraging natural language processing and semantic analysis to address security-related problems on social media. The essence of our idea is to leverage language cohesion and incohesion to infer latent activities that we cannot directly observe. This idea may be applicable to other problems in the Web domain, such as:

• Anomalous topic access [9]: Legitimate topic access could be contrasted to illegitimate access by looking at the differences in the text of the accessed documents.

• Promotional-infection detection [16]: Infected web pages could be contrasted to benign pages by looking at the text contained in these pages.

• Detection of phishing attacks [13]: Phishing emails could be contrasted to regular emails by looking at the email content.

These are all interesting directions for further exploration in the future. Furthermore, the framework developed in this work could be implemented as part of an intelligent agent that detects bad behavior online. The agent could combine our technology for detecting compromised accounts with other work on detecting social/spam bots [6], fake likes [22], rumors [31], misinformation [23], cyberbullying [3], etc. We envision that the agent would also interact with human experts to get supervision when needed.

REFERENCES
[1] Kayode Sakariyah Adewole, Nor Badrul Anuar, Amirrudin Kamsin, Kasturi Dewi Varathan, and Syed Abdul Razak. 2017. Malicious accounts: dark of the social networks. Netw. Comp. Appl. (2017).
[2] Massimo Calabresi. 2017. Inside Russia's Social Media War on America. http://time.com/4783932/inside-russia-social-media-war-america/
[3] Charalampos Chelmis and Mengfan Yao. 2019. Minority Report: Cyberbullying Prediction on Instagram. In ACM Conference on Web Science.
[4] Chris Ciaccia. 2017. McDonald's Twitter account hacked, blasts Trump. http://www.foxnews.com/tech/2017/03/16/mcdonalds-twitter-account-blasts-trump.html
[5] Corinna Cortes and Vladimir Vapnik. 1995. Support-vector networks. Machine Learning (1995).
[6] Stefano Cresci, Marinella Petrocchi, Angelo Spognardi, and Stefano Tognazzi. 2019. Better safe than sorry: an adversarial approach to improve social bot detection. In ACM Conference on Web Science.
[7] Manuel Egele, Gianluca Stringhini, Christopher Krügel, and Giovanni Vigna. 2013. COMPA: Detecting Compromised Accounts on Social Networks. In NDSS.
[8] Chris Grier, Kurt Thomas, Vern Paxson, and Chao Michael Zhang. 2010. @spam: the underground on 140 characters or less. In CCS.
[9] Siddharth Gupta, Casey Hanson, Carl A. Gunter, Mario Frank, David Liebovitz, and Bradley Malin. 2013. Modeling and detecting anomalous topic access. In IEEE International Conference on Intelligence and Security Informatics.
[10] Zaobo He, Zhipeng Cai, Jiguo Yu, Xiaoming Wang, Yunchuan Sun, and Yingshu Li. 2016. Cost-efficient strategies for restraining rumor spreading in mobile social networks. IEEE Transactions on Vehicular Technology 66, 3 (2016).
[11] Hamid Karimi, Courtland VanDam, Liyang Ye, and Jiliang Tang. 2018. End-to-End Compromised Account Detection. In ASONAM.
[12] Heather Kelly. 2013. Twitter hacked; 250,000 accounts affected. http://www.cnn.com/2013/02/01/tech/social-media/twitter-hacked
[13] Mahmoud Khonji, Youssef Iraqi, and Andrew Jones. 2013. Phishing detection: a literature survey. IEEE Communications Surveys & Tutorials 15, 4 (2013).
[14] S. Kullback and R. A. Leibler. 1951. On Information and Sufficiency. Ann. Math. Statist. (1951).
[15] Quoc Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In International Conference on Machine Learning.
[16] Xiaojing Liao, Kan Yuan, XiaoFeng Wang, Zhongyu Pei, Hao Yang, Jianjun Chen, Haixin Duan, Kun Du, Eihal Alowaisheq, Sumayah Alrwais, et al. 2016. Seeking nonsense, looking for trouble: Efficient promotional-infection detection through semantic inconsistency search. In IEEE Symposium on Security and Privacy (SP).
[17] Juan Martinez-Romo and Lourdes Araujo. 2013. Detecting malicious tweets in trending topics using a statistical analysis of language. Expert Syst. Appl. (2013).
[18] Andrea Park. 2017. ABC News, "Good Morning America" Twitter accounts hacked, praise Trump. http://multimedia.cbs.com/news/abc-news-good-morning-america-twitter-accounts-hacked-praise-trump/
[19] Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global vectors for word representation. In EMNLP.
[20] Xin Ruan, Zhenyu Wu, Haining Wang, and Sushil Jajodia. 2016. Profiling Online Social Behaviors for Compromised Account Detection. IEEE Trans. Information Forensics and Security (2016).
[21] John Russell. 2017. Prominent Twitter accounts compromised after third-party app Twitter Counter hacked. https://techcrunch.com/2017/03/15/twitter-counter-hacked/
[22] Indira Sen, Anupama Aggarwal, Shiven Mian, Siddharth Singh, Ponnurangam Kumaraguru, and Anwitaman Datta. 2018. Worth its weight in likes: Towards detecting fake likes on Instagram. In ACM Conference on Web Science.
[23] Haeseung Seo, Aiping Xiong, and Dongwon Lee. 2019. Trust It or Not: Effects of Machine-Learning Warnings in Helping Individuals Mitigate Misinformation. In ACM Conference on Web Science.
[24] Karen Sparck Jones. 1972. A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation (1972).
[25] Kurt Thomas, Frank Li, Chris Grier, and Vern Paxson. 2014. Consequences of Connectivity: Characterizing Account Hijacking on Twitter. In SIGSAC.
[26] David Trang, Fredrik Johansson, and Magnus Rosell. 2015. Evaluating Algorithms for Detection of Compromised Social Media User Accounts. In ENIC.
[27] Courtland VanDam, Jiliang Tang, and Pang-Ning Tan. 2017. Understanding compromised accounts on Twitter. In WI.
[28] Bimal Viswanath, Muhammad Ahmad Bashir, Mark Crovella, Saikat Guha, Krishna P. Gummadi, Balachander Krishnamurthy, and Alan Mislove. 2014. Towards Detecting Anomalous User Behavior in Online Social Networks. In USENIX.
[29] Yiren Wang, Dominic Seyler, Shubhra Kanti Karmaker Santu, and ChengXiang Zhai. 2017. A Study of Feature Construction for Text-based Forecasting of Time Series Variables. In CIKM.
[30] Sheng Wen, Mohammad Sayad Haghighi, Chao Chen, Yang Xiang, Wanlei Zhou, and Weijia Jia. 2014. A sword with two edges: Propagation studies on both positive and negative information in online social networks. IEEE Trans. Comput. 64, 3 (2014).
[31] Liang Wu and Huan Liu. 2019. Debunking Rumors in Social Networks: A Timely Approach. In ACM Conference on Web Science.
[32] Jaewon Yang and Jure Leskovec. 2011. Patterns of temporal variation in online media. In WSDM.
[33] Eva Zangerle and Günther Specht. 2014. "Sorry, I was hacked": a classification of compromised Twitter accounts. In SAC.
[34] ChengXiang Zhai and Sean Massung. 2016. Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining. Morgan & Claypool.
