Research on behavioral targeted advertising in mobile Internet



July 2014, 21(Suppl. 1): 1–5 www.sciencedirect.com/science/journal/10058885 http://jcupt.xsw.bupt.cn

The Journal of China Universities of Posts and Telecommunications


LIU Jian-yi¹, Wang Cong², Zhang Ru¹, Zhao Xin-yi¹

1. School of Computer, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. School of Software Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract

This paper designs and implements an advertising recommendation system based on user behavior on the mobile Internet. By analyzing the user’s online behavior, a user profile that includes long-term and short-term interests is obtained. The similarity between the behavior clusters and the advertisements is then calculated and used to rank and select the most appropriate advertisements. The experiments show that the system can push appropriate advertisements according to the user’s different interests.

Keywords mobile Internet, user behavior, contextual advertising

1 Introduction

The commercial launch of 4G and the issuing of virtual telecom operator licenses have transformed the mobile Internet environment in both technical and economic respects. The comprehensive and deep integration of networks (telecommunication, television and computer networks), services (telecommunications services, Internet applications, video services), commercial models (free, paid, third-party payment) and terminals (mobile and fixed televisions, telephones, computers and so on) will produce many new businesses. Because it is interactive, carries a large amount of information, has wide coverage and has low input costs, mobile advertising has become one of the main profit models of the mobile Internet.

Currently, mobile advertising does not take the user’s interests into account. Users passively receive advertising content that they are not interested in, so many of them are hostile to these ads. Advertising can only be effective when advertisements are pushed on the basis of an analysis of the user’s interests, and this is the inevitable trend in the development of mobile advertising.

Advertising recommendation technologies include sponsored search, contextual advertising and behavioral targeting [1]. Sponsored search, also known as keyword advertising, takes the keywords that a user enters into a search engine and returns advertisements associated with those keywords from the back-end ad library. Examples include the AdSense program of Google and the paid ranking service of Baidu. Research on sponsored search has focused on advertising optimization, ad click prediction, paid ranking, query rewriting and other aspects [2–6].

Received date: 20-06-2014 Corresponding author: LIU Jian-yi, E-mail: [email protected] DOI: 10.1016/S1005-8885(14)60520-3

Contextual advertising places content-related ads on a web page based on the content the user is browsing [7]. Examples include the Contextual Match product of Yahoo and the online advertising platform of the Microsoft service network (MSN).

Behavioral targeting is a new advertising model. It uses the Internet browsing behavior of users to provide advertising that matches the user’s intent. By analyzing a user’s web history, it extracts valuable behavior and delivers related ads, so behavioral targeting can provide ads that meet the user’s needs [8]. There is currently little academic research on behavioral targeting. In Refs. [9–10], the user’s click behavior is used for behavioral targeting. In Refs. [11–13], users are first grouped by probabilistic latent semantic analysis, and advertising is then recommended based on the click behavior of users in the same group. In industry, many commercial systems have developed behavioral targeting functions. For example, Adlink [14] considers the short-term session behavior of users, DoubleClick [15] uses the browser and the user’s operating system to improve user segmentation, and Specificmedia [16] can predict each user’s interests and willingness to purchase.

Because mobile terminals are personal devices, the data acquired from them carries precise personal characteristics, so it is easier for operators and advertisers to identify users, analyze their behavior and discover their interests. This paper studies an advertising recommendation system based on user behavior, which pushes advertisements by analyzing the user’s browsing logs and calculating the user’s short-term and long-term interests.

2 System architecture

The behavioral targeted advertisement recommendation system is designed to improve the efficiency of advertising by modeling user behavior and recommending advertisements precisely on the basis of the user’s behavioral data. The system architecture, shown in Fig. 1, comprises a user behavior analysis subsystem, an advertising subsystem and a wireless application protocol (WAP) webpage analysis subsystem.

Fig. 1 System architecture of behavioural targeted advertising in mobile internet

Among these subsystems, the WAP webpage analysis subsystem contains a webpage analyzing module, a segmentation module and a webpage classification module. The webpage analyzing module performs text extraction: it identifies the meaningful text of a webpage and eliminates irrelevant information such as navigation links, images, music and other multimedia content. It mainly uses HTML tag removal, matching the tags in the webpage markup, to extract the page text. Its output is the text content of the webpage (with the title and some key labels), which is then processed by the segmentation module. The segmentation module uses a forward maximum matching algorithm to split the long sentences of the text into individual words and assigns each word its part of speech by looking it up in a dictionary. The webpage classification module uses a Bayesian classification algorithm to identify the theme of the webpage, and the theme in turn identifies the content the user is interested in.
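The forward maximum matching step of the segmentation module can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `forward_max_match`, the toy dictionary format (word mapped to a part-of-speech tag), the `max_len` parameter and the placeholder tag `"unk"` are all assumptions made for illustration.

```python
def forward_max_match(text, dictionary, max_len=14):
    """Segment `text` greedily, always taking the longest dictionary word.

    `dictionary` maps a word to its part-of-speech tag (assumed format).
    Characters not covered by any dictionary word are emitted one at a
    time with the placeholder tag "unk" (an assumption of this sketch).
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in dictionary:
                tokens.append((piece, dictionary[piece]))
                break
        else:
            # No dictionary word starts here: fall back to one character.
            piece = text[i]
            tokens.append((piece, "unk"))
        i += len(piece)
    return tokens


# Toy dictionary; real systems would load a full lexicon.
toy_dict = {"mobile": "adj", "internet": "n", "mobileinternet": "n", "ads": "n"}
print(forward_max_match("mobileinternetads", toy_dict))
```

Because the longest match is tried first, the unsegmented string is split into `mobileinternet` and `ads` rather than `mobile`, `internet` and `ads`; this greedy preference for long words is the defining behavior of the technique.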

In the user behavior analysis subsystem, the browsing process and the user’s behavior are recorded in the WAP log. The system builds a model of the user’s behavior characteristics from the browsing history in the WAP log and distinguishes the user’s long-term and short-term interests by browsing time. Long-term interest reflects stable browsing behavior over a relatively long period of time, so it is predictable. Short-term interest reflects occasional browsing behavior over a short period of time and is difficult to predict; for example, a male user may browse many webpages about Valentine’s Day gifts because he is buying a gift for his wife.

The advertising subsystem uses the webpage category and the user’s behavior characteristics model to determine the advertising strategy. Given the category of the webpage, the system searches the ad library of that category and then pushes the advertisements in it that are most similar to the user’s interests.

3 User behavior analysis algorithm

Psychological studies suggest that human memory is divided into long-term memory and short-term memory. Correspondingly, interest is also divided into long-term interest and short-term interest. Long-term interest is a relatively fixed preference that accumulates over a long period of time; it shows up as the user being interested in something for a long time. On the mobile Internet, it usually appears as the user continuously paying attention to a certain kind of webpage. Short-term interest is observed through the user’s short-term behavior; it shows up as the user being interested in certain things during a certain period of time. On the mobile Internet, it usually appears as accesses to some pages suddenly increasing and later returning to their previous level. Short-term interest is often active and changeable, so it requires a rapid response. Therefore, this paper uses a sliding window method in which the user’s current interest is represented by the webpage samples in the sliding window. New webpages continuously slide into the window from the left side, so the short-term model is updated quickly.

For the access log of a given user, suppose the size of the sliding window is $k$. In time order, the window contains $k$ webpages $P_1, P_2, \ldots, P_k$ with corresponding access times $t_1, t_2, \ldots, t_k$, and the current user interest set is {Topic}:

1) Take the webpages D in time order, from the earliest to the most recent, and use the Bayesian classification algorithm to classify each webpage. The classification formula is as follows:

$$H(D) = \arg\max_{T_j \in \{\mathrm{Topic}\}} \sum_{f_i \in F} \Pr(f_i \mid D) \left[ \frac{\Pr(f_i \mid T_j)\,\Pr(T_j)}{\sum_{T' \in \{\mathrm{Topic}\}'} \Pr(f_i \mid T')\,\Pr(T')} \right] \tag{1}$$

In this formula, $F$ is the feature set of the classified webpage D. $\Pr(f_i \mid T_j)$ is estimated as the number of samples of the $j$-th theme $T_j$ in the topic sample collection that contain $f_i$, divided by the total number of samples in $T_j$. $\Pr(T_j)$ is the relative frequency of the samples of $T_j$, while $\Pr(f_i \mid D)$ is estimated as the number of occurrences of $f_i$ in D divided by the total number of term occurrences in D.

2) If webpage D belongs to a theme in the current interest set, add it to that theme’s set and update the classification model of that theme.

3) If webpage D does not belong to any theme in the current interest set, treat this webpage as a new interest class and add it to the current interest set.

4) Process all the webpages according to the above steps. When the user visits a new webpage, handle it with the same steps. Slide the window from left to right; the oldest webpage slides out of the window and is removed from the current set.
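Steps 1) to 4) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: `classify` scores a page with formula (1) using pre-estimated probabilities passed in as `topic_models` and `priors` (hypothetical names), the normalizing sum in the denominator is taken over all themes in `topic_models`, and a page that matches no theme returns `None` so that the caller can open a new interest class. The `InterestWindow` class and its fixed `size` are likewise illustrative.

```python
from collections import Counter, deque


def classify(page_terms, topic_models, priors):
    """Return (theme, score) maximising the formula (1) score for one page.

    page_terms   : list of terms extracted from page D
    topic_models : {theme: {term: Pr(term | theme)}}, assumed pre-estimated
    priors       : {theme: Pr(theme)}
    """
    counts = Counter(page_terms)
    total = sum(counts.values())
    best_theme, best_score = None, 0.0
    for theme, model in topic_models.items():
        score = 0.0
        for term, n in counts.items():
            # Denominator of (1): sum over themes of Pr(f | T') * Pr(T').
            norm = sum(m.get(term, 0.0) * priors[t] for t, m in topic_models.items())
            if norm > 0.0:
                # Pr(f | D) is the relative frequency of the term in the page.
                score += (n / total) * model.get(term, 0.0) * priors[theme] / norm
        if score > best_score:
            best_theme, best_score = theme, score
    return best_theme, best_score


class InterestWindow:
    """Sliding window of the most recent pages, grouped by interest theme."""

    def __init__(self, size):
        self.size = size
        self.window = deque()   # (theme, access_time) in arrival order
        self.interests = {}     # theme -> access times currently in the window

    def add(self, theme, access_time):
        self.window.append((theme, access_time))
        # Steps 2 and 3: extend an existing theme or open a new one.
        self.interests.setdefault(theme, []).append(access_time)
        # Step 4: the oldest page slides out once the window is full.
        if len(self.window) > self.size:
            old_theme, old_time = self.window.popleft()
            self.interests[old_theme].remove(old_time)
            if not self.interests[old_theme]:
                del self.interests[old_theme]


models = {"sports": {"goal": 0.5, "match": 0.5}, "finance": {"stock": 0.6, "bank": 0.4}}
priors = {"sports": 0.5, "finance": 0.5}
theme, score = classify(["goal", "match", "goal"], models, priors)

window = InterestWindow(size=2)
for t, label in enumerate(["sports", "finance", "sports"], start=1):
    window.add(label, t)
```

Updating the per-theme classification model in step 2) is omitted here, since the paper does not specify how the model is re-estimated.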

After the above steps, several interest themes are formed for the user. Some of these themes may result from the user’s random accesses. They contain few webpage samples and cannot express the user’s interest, so they are filtered out as noise. The importance of the different interest themes varies with the order of the user’s access times. The latest short-term interest and the stable long-term interest are more important than earlier short-term interests. The importance of a long-term interest depends on the regularity of the user’s access frequency. The importance $w_i$ of a short-term interest theme is measured by the average access time of the webpage samples in the theme. The importance $w_i$ of a long-term interest theme is measured by the mean square error of the access times of the webpage samples in the theme.
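The noise filter and the two statistics behind the weights can be sketched as below. The paper does not give the exact mapping from these statistics to $w_i$, so this sketch only computes the raw quantities. The threshold `min_pages` (standing in for the effective webpage number), the function name and the use of the population variance as the "mean square error" are assumptions.

```python
from statistics import mean, pvariance


def theme_statistics(interests, min_pages=3):
    """Compute per-theme access-time statistics, dropping noisy themes.

    interests : {theme: [access times of the pages in the window]}
    min_pages : assumed threshold below which a theme is treated as noise
    """
    stats = {}
    for theme, times in interests.items():
        if len(times) < min_pages:
            continue  # too few samples: filtered out as noise
        stats[theme] = {
            # Basis of the short-term weight: average access time.
            "mean_time": mean(times),
            # Basis of the long-term weight: spread of the access times
            # (population variance, an interpretation of "mean square error").
            "time_variance": pvariance(times),
        }
    return stats


# Access times are window positions here; any monotonic time unit would do.
example = {"class1": [1, 2, 3, 4, 5], "class4": [8, 9, 10], "other": [9, 10]}
result = theme_statistics(example)
```

In this toy input, `class4` has the later mean access time, while `class1` has the larger spread across the window; `other` is dropped because it has fewer than `min_pages` samples.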

When a user browses a webpage, the system first classifies the webpage and identifies its category. It then searches the ad library of this category and uses the user’s interests to score the ads in it, finally obtaining the ads most similar to the user’s interests. The formula is as follows:

$$S_{ij} = w_i \, S(T_i, A_j) \tag{2}$$

In this formula, $T_i$ is the $i$-th interest, $w_i$ is the weight of this interest and $A_j$ is the $j$-th advertisement. $S(a, b)$ is calculated by the cosine formula, and $S_{ij}$ is the similarity between the $i$-th interest and the $j$-th advertisement. In this way, each ad has several scores; the maximum score $\max_i \{S_{ij}\}$ is taken as the score $S_j$ of the advertisement. Finally, the advertisements appropriate for the user are selected in descending order of $S_j$.
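Formula (2) and the ranking step can be sketched as follows. The sparse term-weight dictionaries used to represent interests and advertisements, and the names `cosine` and `rank_ads`, are assumptions; the paper only states that $S(a, b)$ is a cosine similarity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors given as {term: weight}."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_ads(interests, ads):
    """Rank ads by S_j = max_i w_i * S(T_i, A_j), highest score first.

    interests : {theme: (weight w_i, term vector of T_i)}
    ads       : {ad_id: term vector of A_j}
    """
    scores = {}
    for ad_id, ad_vec in ads.items():
        scores[ad_id] = max(
            (w * cosine(vec, ad_vec) for w, vec in interests.values()),
            default=0.0,
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only.
user_interests = {
    "sports": (0.8, {"football": 1.0, "match": 1.0}),
    "travel": (0.3, {"hotel": 1.0}),
}
ad_library = {"ad1": {"football": 1.0}, "ad2": {"hotel": 1.0}, "ad3": {"bank": 1.0}}
ranking = rank_ads(user_interests, ad_library)
```

Taking the maximum over interests means an ad is ranked by its best-matching interest, so a highly weighted theme can outrank a perfect match on a weakly weighted one.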

4 Experimental data and analysis

4.1 Dataset

This paper uses the WAP log of a telecom operator, which is named by the date. The parameters in the log are separated by spaces, and each line records one user action.

The names and interpretations of the parameters are as follows:

Table 1 Parameters of the WAP log

| Parameter name | Interpretation of parameter |
|---|---|
| Date | Date on which the user accessed WAP, e.g. 2013-01-01 |
| Time | Time at which the user accessed WAP, e.g. 00:01:01 |
| c-IP | IP address of the user accessing WAP, e.g. 166.111.143.222 |
| CS-number | Calling number of the accessing user. The number is encrypted, and the user is identified by the encrypted value. |
| s-IP | Address of the server accessed by the user, e.g. 239.255.255.250 |
| s-port | Port of the server accessed by the user, e.g. 8080 |
| CS-method | Type of request issued by the client, e.g. GET, POST |
| CS-URI-stem | Address of the WAP webpage accessed by the user, e.g. wap.sohu.com |
| CS-status | Status code of the user’s WAP access, e.g. 200, 302 |
| CS(user-agent) | Mobile phone model of the user accessing WAP, e.g. MEIZU M8 |
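A log line with these parameters could be parsed as in the following sketch. The field order is assumed to follow Table 1, and the sample line (including the encrypted number `a1b2c3`) is invented for illustration. Since the user-agent value may itself contain spaces, the line is split at most nine times so that the final field keeps them.

```python
# Field order is an assumption: it simply follows the rows of Table 1.
FIELDS = [
    "Date", "Time", "c-IP", "CS-number", "s-IP",
    "s-port", "CS-method", "CS-URI-stem", "CS-status", "CS(user-agent)",
]


def parse_wap_line(line):
    """Split one space-separated WAP log line into a {field: value} dict."""
    # At most 9 splits: the last field (user agent) may contain spaces.
    parts = line.rstrip("\n").split(" ", len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


# Hypothetical line assembled from the example values in Table 1.
sample = ("2013-01-01 00:01:01 166.111.143.222 a1b2c3 "
          "239.255.255.250 8080 GET wap.sohu.com 200 MEIZU M8")
record = parse_wap_line(sample)
```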


The advertisement data used in this paper is keyword advertising acquired from Baidu, Google and other search engines, containing about 2 000 themes and 10 000 different advertisements. The experimental data consists of about 100 webpages from one user’s access record, covering a variety of categories. Among these webpages, there are 21 related webpages in category 1 and 22 related webpages in category 2, evenly distributed throughout the first 80 webpages. The webpages in category 3 are distributed evenly throughout the first 50 webpages, and the webpages in category 4 are concentrated among the most recently accessed webpages. Webpages of these 4 categories account for 75% of the user’s accessed webpages; webpages of other themes are relatively dispersed and are handled as noise. The classification and distribution of the webpages is shown in Fig. 2.

Fig. 2 Classification and distribution of the webpage

4.2 Analysis of short-term behavior results

For the short-term behavior of users, the weight of an interest theme depends mainly on the access time of the webpage samples in the theme: the more recent the access time, the higher the weight.

Fig. 3 shows the impact of the user’s short-term behavior on the weight of each classification. The horizontal axis represents the tracking window number and the vertical axis represents the weight of the interest themes as it changes over time. When the number of webpages of a classification in the window is less than the effective webpage number, the classification is filtered out as noise. In the early stage, class 1, class 2 and class 3 are distributed evenly in the samples, but the weights are distributed unevenly because of the noise filtering in the tracking windows. In the later stage, the weight of class 3 decreases because the class 3 webpages leave the tracking windows; at the same time, the weight of class 4 increases remarkably because the user concentrates on browsing class 4 in the later stage. Since the other classifications do not reach the effective webpage number, they are filtered out as noise.

Fig. 3 Impact of user’s short-term behaviour on the weight of interest theme

From the resulting weights, we can see that at some points, for example in the 26–36 window interval, the weight differences between class 1, class 2 and class 3 are not small, so the advertisement types can be recommended according to the weight values.

4.3 Analysis of long-term behavior results

For the long-term behavior of users, the weight of an interest theme depends mainly on the degree of dispersion of the webpage samples in the theme: the more evenly the webpages are distributed, the more regular the user’s browsing frequency and the higher the weight.

Fig. 4 shows the impact of the user’s long-term behavior on the weight of the interest themes. The horizontal axis represents the tracking window number and the vertical axis represents the weight of each classification as it changes over time. When the number of webpages of a classification in the window is less than the effective webpage number, the classification is filtered out as noise. As for class 1 and class 2, because their samples are distributed evenly in the early part, their long-term behavior weights are high. As for class 3, although it is distributed evenly in the early part of the samples, it is less evenly distributed than class 1 and class 2 in the later part, so the weight of class 3 is lower than that of class 1 and class 2. At the same time, the weight of class 4 increases remarkably because the user concentrates on browsing class 4 in the later stage, but this browsing is not sustained, so its weight remains lower than that of class 1 and class 2. Since the other classifications do not reach the effective webpage number, they are filtered out as noise.

Therefore, class 1 and class 2 can represent the long-term interest of the user, so when pushing ads to this user, ads related to class 1 and class 2 can be pushed first.

Fig. 4 Impact of long-term behaviour of user on weight of interest theme

5 Conclusions

This paper designs and implements an advertising recommendation system based on user behavior on the mobile Internet. The system contains a user behavior analysis subsystem, an advertising subsystem and a WAP webpage analysis subsystem. Using the proposed user behavior analysis algorithm, a user profile that includes long-term and short-term interests is obtained. Finally, the similarity between the behavior clusters and the advertisements is calculated and used to rank and select the most appropriate advertisements. The experiments show that the system can push appropriate advertisements according to the user’s different interests.

However, the accuracy of the algorithm still needs to be improved. The proposed algorithm requires further stability testing and verification to ensure its accuracy and stability.

Acknowledgements

This work was supported by the Beijing Higher Education Young Elite Teacher Project (YETP0448), the National Key Technology Research and Development Program (2012BAH08B02), the Beijing Municipal Science and Technology Project (Z131100001113034), the Specialized Research Fund for the Doctoral Program of Higher Education (2013114), the Hi-Tech Research and Development Program of China (2012AA012606), and the Digital Right Management Technology Research and Development Project (1681300000119).

References

1. Broder A, Fontoura M, Josifovski V, et al. A semantic approach to contextual advertising. Proceedings of the 30th International Conference on SIGIR , Amsterdam, 2007: 559−566

2. Anastasakos T, Hillard D, Kshetramade S, et al. A collaborative filtering approach to ad recommendation using the query-ad click graph. Proceedings on CIKM, 2009: 1927−1930

3. Attenberg J, Pandey S, Suel T. Modeling and predicting user behavior in sponsored search. Proceedings on KDD, 2009: 1067−1076

4. Hillard D, Schroedl S, Manavoglu E, et al. Improving ad relevance in sponsored search. Proceedings on WSDM, 2010: 361−370

5. Zhang W, He X, Rey B, et al. Query rewriting using active learning for sponsored search. Proceedings on SIGIR, 2007: 853−854

6. Zhang W, Jones R. Comparing click logs and editorial labels for training query rewriting. Proceedings on WWW Workshop on Query Log Analysis, Social and Technological Challenges, 2007

7. Shi S C, Cehng T, Wang X, et al. Advertisement-promotion research based on the content of webpage. Journal of Chinese Information Processing, 2007, 21(4): 42−47

8. Yu S P, Chen G. An efficient algorithm of behavior-targeted advertising. Computer Applications and Software, 2011, 28(4): 4−7

9. Chakrabarti D, Agarwal D, Josifovski V. Contextual advertising by combining relevance with click feedback. WWW 2008

10. Li T, Liu N, Yan J, Wang G, et al. A Markov chain model for integrating behavioral targeting into contextual advertising. In Proceedings of KDD Workshop on Data Mining and Audience Intelligence for Advertising. 2009: 1−9

11. Wu X H, Yan J, Liu N, et al. Probabilistic latent semantic user segmentation for behavioral targeted advertising. KDD Workshop on Data Mining and Audience Intelligence for Advertising, 2009: 10−17

12. Yan J, Liu M, Wang G, et al. How much the Behavioral targeting can help online advertising? In Proceeding of WWW’09, 2009: 261−270

13. Ahmed A, Low Y, Aly M, et al. Scalable distributed inference of dynamic user interests for behavioral targeting. Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2011

14. Adlink, https://www.google.com/adsense/login/en_US/?gsessionid=Dc28hZShnCI

15. DoubleClick, http://www.doubleclick.com/products/dfa/index.aspx

16. Specificmedia, http://www.specificmedia.co.uk/