Collaborative Location and Activity Recommendations with GPS History Data

Vincent W. Zheng †, Yu Zheng ‡, Xing Xie ‡, Qiang Yang †
† Hong Kong University of Science and Technology
‡ Microsoft Research Asia


This work was done while Vincent was an intern at Microsoft Research Asia.

Introduction and Motivation
Users are now sharing GPS trajectories on the Web
Wisdom of the crowd: incorporating users' knowledge

Travel experience: some places are more popular than others
User activities: "The food is delicious" --> dining at that place

Goal: To Answer 2 Typical Questions
Q1: What can I do there if I visit some place? (Activity recommendation given location query)
Q2: Where should I go if I want to do something? (Location recommendation given activity query)

Problem Definition
How to model the location-activity relation well
Encode it into a matrix
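Once the relation is encoded as a matrix, both questions reduce to sorting one row or one column. The sketch below illustrates this in Python with NumPy. The scores are hypothetical, chosen only for illustration, and the helper names `recommend_locations` and `recommend_activities` are not from the slides.

```python
import numpy as np

# Hypothetical popularity scores (rows: locations, columns: activities).
# These numbers are illustrative only, not values from the dataset.
locations = ["Forbidden City", "Bird's Nest", "Zhongguancun"]
activities = ["Tourism", "Exhibition", "Shopping"]
X = np.array([
    [5, 4, 2],
    [4, 3, 1],
    [1, 2, 6],
])

def recommend_locations(activity):
    """Q2: rank locations for a given activity (sort one column)."""
    column = X[:, activities.index(activity)]
    order = np.argsort(-column, kind="stable")  # descending by score
    return [locations[i] for i in order]

def recommend_activities(location):
    """Q1: rank activities for a given location (sort one row)."""
    row = X[locations.index(location)]
    order = np.argsort(-row, kind="stable")  # descending by score
    return [activities[i] for i in order]

print(recommend_locations("Tourism"))
print(recommend_activities("Forbidden City"))
```

Sorting a dense row or column is only meaningful when the entries are filled in, which is why the rest of the talk concentrates on completing the matrix.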

Example
An entry denotes how popular an activity is at a location
Recommend by ranking along the columns or rows
[Figure: location-activity matrix with rows Forbidden City, Bird's Nest, Zhongguancun and columns Tourism, Exhibition, Shopping]
Location recommendation, Tourism: Forbidden City > Bird's Nest > Zhongguancun
Activity recommendation, Forbidden City: Tourism > Exhibition > Shopping

Contributions
In practice, it's sparse!
User comments are few (in our dataset, [...])

(1) K-means [K = 200]
(2) DBSCAN [ε = 0.001, MinPts = 4]
(3) OPTICS [ε = 0.05, MinPts = 4]
(4) Grid clustering [d = 300]

Location-Activity Extraction
Location-activity matrix

Activity: tourism
Comment: "We took a tour bus to see around along the forbidden city moat"
GPS: 39.903, 116.391, 14/9/2009 15:25
Stay Region: 39.910, 116.400 (Forbidden City)
[Figure: Location-Activity Matrix with rows Forbidden City, Zhongguancun and columns Tourism, Food; the (Forbidden City, Tourism) entry is incremented by +1]
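The +1 update for a commented GPS point can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each comment already carries an activity label and is assigned to the nearest stay-region centre by great-circle distance. The stay-region coordinates are the two shown on the slides; the function names are hypothetical.

```python
import math
from collections import Counter

# Stay-region centres (latitude, longitude) as shown on the slides.
STAY_REGIONS = {
    "Forbidden City": (39.910, 116.400),
    "Zhongguancun": (39.980, 116.306),
}

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest_region(point):
    """Assumption: a comment belongs to the closest stay-region centre."""
    return min(STAY_REGIONS, key=lambda name: haversine_km(point, STAY_REGIONS[name]))

def build_counts(comments):
    """Add +1 to the (location, activity) entry for every labelled comment."""
    counts = Counter()
    for point, activity in comments:
        counts[(nearest_region(point), activity)] += 1
    return counts

# The comment from the slide: GPS 39.903, 116.391, labelled as tourism.
counts = build_counts([((39.903, 116.391), "Tourism")])
print(counts)
```

Entries that never receive a comment stay at zero, which is the sparsity problem the following slides address.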

User comments are few -> this matrix is sparse!
Our objective: to fill this matrix.

Road Map

Location Feature Extraction
Location features: Points of Interest (POIs)
[Figure: map of POIs around a stay region, labeled restaurant, bank, shopping mall]
Stay Region: 39.980, 116.306 (Zhongguancun)
[restaurant, bank, shop] = [3, 1, 1]
TF-IDF style normalization*: feature = [0.13, 0.32, 0.18]

* TF-IDF (Term Frequency-Inverse Document Frequency):

Example: Assume that of 10 locations, 8 have restaurants (less distinguishing), while 2 have banks and 4 have shops:
tf-idf(restaurant) = (3/5)*log(10/8) = 0.13
tf-idf(bank) = (1/5)*log(10/2) = 0.32
tf-idf(shop) = (1/5)*log(10/4) = 0.18
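The arithmetic above can be checked with a few lines of Python. The sketch assumes the natural logarithm, which is what reproduces the quoted values; the function name `tfidf` and its argument names are hypothetical.

```python
import math

def tfidf(poi_counts, doc_freq, n_locations):
    """TF-IDF style weight per POI category for one location.

    poi_counts:  category -> number of POIs of that category in the location
    doc_freq:    category -> number of locations containing that category
    n_locations: total number of locations
    """
    total = sum(poi_counts.values())
    return {
        category: (count / total) * math.log(n_locations / doc_freq[category])
        for category, count in poi_counts.items()
    }

features = tfidf(
    poi_counts={"restaurant": 3, "bank": 1, "shop": 1},
    doc_freq={"restaurant": 8, "bank": 2, "shop": 4},
    n_locations=10,
)
rounded = {category: round(value, 2) for category, value in features.items()}
print(rounded)  # {'restaurant': 0.13, 'bank': 0.32, 'shop': 0.18}
```

Restaurants are the most frequent POI in this stay region but receive the lowest weight, because a category present in 8 of 10 locations says little about any one of them.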

[Figure: Location-Feature Matrix with rows Forbidden City, Zhongguancun and columns restaurant, bank; the Zhongguancun row holds 0.13 and 0.32]

Road Map

Activity Correlation Extraction
How likely is one activity to happen if another activity happens?
Automatically mined from the Web, potentially useful when #(act) is large

Most mined correlations are reasonable. Example: Tourism with other activities.
[Figure: correlations from Web search (from Bing) compared with human design (average over 8 subjects)]
Web search query: "Tourism and Amusement" and "Food and Drink"
Correlation = h(1.16M), where h is a normalization function
Tourism-Shopping is more likely to happen together than Tourism-Sports

Road Map

Solution: Collaborative Location and Activity Recommendation (CLAR)
Collaborative filtering, with collective matrix factorization

Low-rank approximation, by minimizing an objective function of U, V and W, where U, V and W are the low-dimensional representations of the locations, activities and location features, respectively, and I is an indicator matrix.

After getting U* and V*, reconstruct the incomplete X
Efficient: complexity is linear in #(loc), so it can handle large data
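The objective itself is not preserved in this transcript, so the following is only a sketch of one plausible collective matrix factorization, not the authors' exact formulation. It assumes the loss 1/2 ||I ∘ (X − UVᵀ)||² + λ1/2 ||Y − UWᵀ||² + λ2/2 ||Z − VVᵀ||² + λ3/2 (||U||² + ||V||² + ||W||²), where Y is the location-feature matrix and Z the activity-correlation matrix, and it uses plain gradient descent. All names (`clar_loss`, `clar_fit`, `lam1`, `lam2`, `lam3`) and all data values are hypothetical.

```python
import numpy as np

def clar_loss(X, I, Y, Z, U, V, W, lam1=0.1, lam2=0.1, lam3=0.1):
    """Assumed objective: masked fit to X plus two side-information terms."""
    fit = 0.5 * np.sum((I * (X - U @ V.T)) ** 2)
    feat = 0.5 * lam1 * np.sum((Y - U @ W.T) ** 2)
    corr = 0.5 * lam2 * np.sum((Z - V @ V.T) ** 2)
    reg = 0.5 * lam3 * (np.sum(U ** 2) + np.sum(V ** 2) + np.sum(W ** 2))
    return fit + feat + corr + reg

def clar_fit(X, I, Y, Z, k=2, lam1=0.1, lam2=0.1, lam3=0.1,
             lr=0.005, steps=2000, seed=0):
    """Gradient descent on the assumed objective; returns U, V, W."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((X.shape[0], k))  # locations
    V = 0.1 * rng.standard_normal((X.shape[1], k))  # activities
    W = 0.1 * rng.standard_normal((Y.shape[1], k))  # location features
    for _ in range(steps):
        R = I * (U @ V.T - X)   # residual on observed entries only
        Ry = U @ W.T - Y        # location-feature residual
        Rz = V @ V.T - Z        # activity-correlation residual
        grad_U = R @ V + lam1 * Ry @ W + lam3 * U
        grad_V = R.T @ U + lam2 * (Rz + Rz.T) @ V + lam3 * V
        grad_W = lam1 * Ry.T @ U + lam3 * W
        U, V, W = U - lr * grad_U, V - lr * grad_V, W - lr * grad_W
    return U, V, W

# Hypothetical toy data: 3 locations, 3 activities, 3 POI features.
X = np.array([[5.0, 0.0, 2.0], [4.0, 3.0, 0.0], [0.0, 2.0, 6.0]])
I = (X > 0).astype(float)  # 1 where an entry of X is observed
Y = np.array([[0.05, 0.10, 0.40], [0.02, 0.05, 0.10], [0.13, 0.32, 0.18]])
Z = np.array([[1.0, 0.6, 0.3], [0.6, 1.0, 0.2], [0.3, 0.2, 1.0]])

U, V, W = clar_fit(X, I, Y, Z)
X_hat = U @ V.T  # dense reconstruction, including the missing entries
print(np.round(X_hat, 2))
```

Because U is shared between the X and Y terms and V between the X and Z terms, information from location features and activity correlations can influence the entries of X that have no comments.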

Experiments
Data:
2.5 years (2007.4-2009.10)
162 users
13K GPS trajectories, 4M GPS points, 140K kilometers
530 comments
Evaluation:
Invite 5 subjects to give ratings independently
Location recommendation: measured on the top 10 returned locations for each of the 5 activities
Activity recommendation: measured on the 5 activities for the top 20 popular locations with most visits
Metric: normalized discounted cumulative gain (nDCG)

Example: for a rating
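A minimal sketch of how nDCG could be computed for one ranked list of ratings. It assumes the common variant with a log2(position + 1) discount; the transcript does not say which variant the authors used, and the ratings below are hypothetical.

```python
import math

def dcg(ratings):
    """Discounted cumulative gain; position 1 has discount log2(2) = 1."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(ratings))

def ndcg(ratings):
    """DCG of the given order divided by the DCG of the ideal (sorted) order."""
    ideal = dcg(sorted(ratings, reverse=True))
    return dcg(ratings) / ideal if ideal > 0 else 0.0

# Hypothetical ratings for a returned list, in the order it was shown.
print(ndcg([3, 2, 1]))  # already in ideal order
print(ndcg([1, 2, 3]))  # worst order for these ratings
```

A list that is already sorted by rating scores 1.0, and any other ordering of the same ratings scores lower.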

Result 1: System Performance
Impact of location feature information (i.e. λ1 @ Fig.11)
Impact of activity correlation information (i.e. λ2 @ Fig.12)
Observations
The weight for each information source should be moderate
Using both sources outperforms using a single source (i.e. λ1 = 0 or λ2 = 0)

Result 2: Baseline Comparison
Single collaborative filtering (SCF): using only the location-activity matrix
Unifying collaborative filtering (UCF): using all 3 matrices, but in a different way. For each missing entry, combine the entries belonging to the top N similar locations and top N similar activities in a weighted way

One-tail t-test p1