Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... ·...
Transcript of Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... ·...
![Page 1: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/1.jpg)
Classification in Mobility Data Mining
![Page 2: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/2.jpg)
Activity Recognition – Semantic Enrichment
![Page 3: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/3.jpg)
Recognition throughPoints-of-Interest
Given a dataset of GPS tracks of private vehicles, we annotate trajectories with the most probable activities performed by the user.
The method associates the list of possible POIs (with corresponding probabilities) visited by a user moving by car when he stops.
A mapping between POIs categories and Transportation Engineering activities is necessary.
?
Gym LeisureHospital Services
Restaurant Food
![Page 4: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/4.jpg)
The enrichment process
● POI collection: Collected in an automatic way, e.g. from Google Places.
● Association POI – Activity: Each POI is associated to a ``activity". For example Restaurant → Eating/Food, Library → Education, etc.
● Basic elements/characteristics:– C(POI) = {category, opening hour, location}– C(Trajectory) = {duration of the stop, stop location, time of the day}– C(User) = {max walking distance}
● Computation of the probability to visit a POI/ to make an activity: For each POI, the probability of ``being visited'' is a function of the POI, the trajectory and the user features.
● Annotated trajectory: The list of possible activities is then associated to a Stop based on the corresponding probability of visiting POIs
![Page 5: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/5.jpg)
Input & Output
Lat; LonTimeStamp: Sun 10:55 – 12:05
Wd = 500 m
Bank: Mon – Fri [8:00 – 15:30]
Dentist: Mon – Sat [9:00 – 13:00] [15:30 – 18:00]
Church: Mon – Sat [18:00 – 19:00] Sun [11:00 – 12:00]
Primary School: Mon – Sat [8:00 – 13:00]
![Page 6: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/6.jpg)
Input & Output
- Stop: Lat; Lon- TimeStamp: Sun 10:55 – 12:05
Wd = 500 m
Church Services [80%]Bar Food [20%]
Pastry Food [100%]
![Page 7: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/7.jpg)
Inferring Activities from social data
![Page 8: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/8.jpg)
Extraction of personal places from Twitter trajectories in Dublin area
The points of each trajectory taken separately were grouped into spatial clusters of maximal radius 150m. For groups with at least 5 points, convex hulls have been built and spatial buffers of small width (5m) around them.1,461,582 points belong to the clusters (89% of 1,637,346); 24,935 personal places have been extracted.
Examples of extracted places
Statistical distribution of the number of places per person
![Page 9: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/9.jpg)
Recognition of the message topics, generation of topical feature vectors, and summarization by the personal places
Topics have been assigned to 208,391 messages (14.3% of the 1,461,582 points belonging to the personal places)
...
1) Some places did not get topic summaries (about 20% of the places)2) In many places the topics are very much mixed3) The topics are not necessarily representative of the place type
(e.g., topics near a supermarket: family, education, work, cafe, shopping, services, health care, friends, game, private event, food, sweets, coffee)
![Page 10: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/10.jpg)
Obtaining daily time series of place visits and comparison with exemplary temporal profiles
The daily time series of place visits have been obtained through aggregation of daily trajectories using only relevant places for each trajectory. The aggregation was done separately for the work days from Monday to Thursday, and for Saturday, Sunday, and Friday.
Exemplary temporal profiles of different activities or visits to different types of places
The time series of place visits are compared to the exemplary time profiles by means of the Dynamic Time Warping (DTW) distance function. Resulting scores: from 0 (no similarity) to 1 (very high similarity). 15,950 places (64% of all) have no similarity to any of the exemplary time patterns. 4,732 places (19%) have the maximal similarity score of 0.8 or higher; 4,179 of them (16.8% of all) were visited in 6 or more days.
![Page 11: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/11.jpg)
Time series with high similarity to “work” (>=0.8)
1,520 places (6.1% of all). These places have also high similarity to “education”, “transport”, and “lunch”.
The time series similarity scores have been combined with the relative frequencies of the topics using a combination matrix
In 233 places out of the initial 1,520 (15%, 0.9% of all places) the similarity to the “work” profile has been reinforced based on the topic frequencies.
![Page 12: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/12.jpg)
Classification of the places according to the highest combined score (minimum 0.8)
20,247 places (81.2%) are not classified; 4,688 (18.8%) are classified, of them 4,048 (16.2%) were visited in at least 6 days
![Page 13: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/13.jpg)
Activity Recognition – Inductive approach
![Page 14: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/14.jpg)
Eigen-behavioursInput Left: subject’s behavior over the course of 113 days for five
situations / activities Right: same data represented as a binary matrix of 113 days (D)
by 120 (H, which is 24 multiplied by the five possible situations)
![Page 15: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/15.jpg)
Eigen-behavioursMethod Are there key activity distributions from which to infer all others
through linear combination? Same idea as PCA
Key distributions
![Page 16: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/16.jpg)
Eigen-behavioursOutput Set of 3 representative eigen-behaviours Each user's activity can be rewritten as their linear combination
activities
![Page 17: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/17.jpg)
Eigen-behavioursExample
![Page 18: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/18.jpg)
Individual Mobility Networks
![Page 19: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/19.jpg)
How to synthesize Individual Mobility?
Mobility Data Mining methods automatically extract relevant episodes: locations and movements.
![Page 20: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/20.jpg)
Rank individual preferred locations
![Page 21: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/21.jpg)
How to synthesize Individual Mobility?
Graph abstraction based on locations (nodes) and movements (edges)
![Page 22: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/22.jpg)
How to synthesize Individual Mobility?
High level representation
Aggregation of sensitive data
Abstraction from real geography
![Page 23: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/23.jpg)
From raw movement…
![Page 24: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/24.jpg)
… to annotated data
![Page 25: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/25.jpg)
1) Build from data anIndividual Mobility Network Individual Mobility Network (IMN)
3) Use a cascading classification with label propagation (ABC classifier)
2) Extract structural features from the IMN
![Page 26: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/26.jpg)
Extracting the IMN
τ(9) = 8
τ(11) = 3τ(0) = 25
ω(1, 2) = 4
ω(2, 1) = 2
ω(0
, 6) =
4
![Page 27: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/27.jpg)
Extracting the IMN
Trip Features
Length
Duration
Time Interval
Average Speed
Network Features
centrality clustering coefficientaverage path length
predictability entropy
hubbiness degreebetweenness
volume edge weightflow per location
![Page 28: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/28.jpg)
Extracting the IMN
ω(1, 2) = 2
ω(2, 1) = 3
from to weight ccFrom ccTo duration
1 2 2 0.22 0.12 10 min
1 2 2 0.22 0.12 5 min
2 1 3 0.12 0.22 4 min
2 1 3 0.12 0.22 6 min
2 1 3 0.12 0.22 4 min
![Page 29: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/29.jpg)
ABC Classifier● Principles:
– The activities of a user should be predicted as a whole, not separately
– Some activities are easy to classify
– Other activities might benefit from contextual information obtained from previous predictions
● E.g.: a place frequently visited after work might be more likely to be leisure / shopping
![Page 30: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/30.jpg)
ABC Classifier● Inspired by Nested Cascade Classification
![Page 31: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/31.jpg)
ABC Classifier● Inspired by Nested Cascade Classification
![Page 32: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/32.jpg)
The ABC classifier
![Page 33: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/33.jpg)
The ABC classifier
![Page 34: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/34.jpg)
The ABC classifier
![Page 35: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/35.jpg)
The ABC classifier
![Page 36: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/36.jpg)
Experiments
GPS traces
6,953 trips65 vehicles
![Page 37: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/37.jpg)
Semantic Mobility AnalyticsTemporal Analysis
● Pisa traffic
In Out
![Page 38: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/38.jpg)
Semantic Mobility AnalyticsTemporal Analysis
● Calci traffic
In Out
![Page 39: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/39.jpg)
Semantic Mobility AnalyticsTemporal Analysis
![Page 40: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/40.jpg)
User Profiling
In computer science, is the process of construction and extraction of models representing user behavior generated by computerized data analysis.Are employed to study, analyze and understand human behaviors and interactions.Are exploited by many applications to make predictions, to give suggestions etc.
![Page 41: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/41.jpg)
MYWay: Trajectory Prediction
![Page 42: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/42.jpg)
Individual and Collective Profile
Individual ProfileInput: Individual DataOutput: Individual Patterns
Collective ProfileInput: Collectivity DataOutput: Collective Patterns
![Page 43: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/43.jpg)
Prediction using probability mixture models
J. Ghosh, M. J. Beal, H. Q. Ngo, and C. Qiao. On profiling mobility and predicting locations of wireless users. 2006.
J. Ghosh, H. Q. Ngo, and C. Qiao. Mobility profile based routing within intermittently connected mobile ad hoc networks (icman). 2006.
I. F. Akyildiz and W. Wang. The predictive user mobility prole framework for wireless multimedia networks. 2004.
![Page 44: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/44.jpg)
Prediction based on individual and collective preferences
F. Calabrese, G. Di Lorenzo, and C. Ratti. Human mobility prediction based onindividual and collective geographical preferences. 2010.
![Page 45: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/45.jpg)
Prediction using complex networks and probability
D. Barth, S. Bellahsene, and L. Kloul. Mobility prediction using mobile user profiles. 2011.
D. Barth, S. Bellahsene, and L. Kloul. Combining local and global proles for mobility prediction in lte femtocells. 2012.
![Page 46: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/46.jpg)
Collective prediction using t-patterns
F. Giannotti, M. Nanni, F. Pinelli, and D. Pedreschi. Trajectory pattern mining. 2007.
A. Monreale, F. Pinelli, R. Trasarti, and F. Giannotti. Wherenext: a location predictor on trajectory pattern mining. 2009.
![Page 47: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/47.jpg)
Mobility Profiling
A concise model ables to describe the user’s mobility in terms of representative movements, i.e. routines.
This model is called Mobility Profile.
Mining mobility user profiles for car pooling. Trasarti, Pinelli, Nanni, Giannotti. KDD 2011
![Page 48: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/48.jpg)
Derived patterns and models: mobility profiles
User history
An ordered sequence of spatio-
temporal points.
Trips construction
Cutting the user history when a stop is detected
Stops Spatial ThresholdStops Temporal Threshold
Grouping
Performing a density based clustering equipped with a spatio temporal distance
function
Spatial TolleranceTemporal Tollerance
Spatio temporal distance
Profile extraction
The medoid of each group becomes user’s routines and the all
set become the user’s mobility profile
Pruning
Groups with a smallNumber of trips are
Pruned
Support Threshold
Trasarti, Pinelli, Nanni, Giannotti. Mining mobility user profiles for car pooling. ACM SIGKDD 2011
![Page 49: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/49.jpg)
Idea in a nutshell
Use the mobility profile to predict the user’s movements. If it is not able to produce a prediction, a collective predictor is used.The collective predictor is built using the mobility profiles of the crowd.
![Page 50: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/50.jpg)
Experimental setting
Starting from a dataset of 1 month of movements, 5.000 users and 326.000 trajectories. We divided the training set, i.e. 3 weeks and as test set the remaining last week.
The trajectories in the test set are cut to become the queries for the predictor. The cuts tested are taking the first 33% or 66% of the trajectories.
![Page 51: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/51.jpg)
Extracting the Mobility Profiles
The first step is to extract the mobility profiles from the training set. In order to assess the quality of them an empirical analysis is performed.
Routines per user distribution (left), trajectories and routines time start distribution (right) and the dataset coverage (bottom)
![Page 52: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/52.jpg)
Results
MyWay obtains good results which are comparable to a global predictor built on top of the whole set of trajectories.
Test 33% Test 66%
![Page 53: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/53.jpg)
proactive car pooling
Project ICON
![Page 54: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/54.jpg)
Carpooling cycleContext
Several initiatives, especially on the web
![Page 55: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/55.jpg)
Carpooling cycleDistinctive features
Users manually insert and update their rides
Users search and contact candidate pals
Users make individual, “local” choice
System autonomously detect systematic trips
System automatically suggest pairings
System seeks globally optimal allocation
Traditional approach vs. ICON cycle
![Page 56: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/56.jpg)
Carpooling cycleAssumptions
Users provide access to their mobility traces
![Page 57: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/57.jpg)
Carpooling cycleStep 1: Inferring Individual Systematic Mobility
• Extraction of Mobility Profiles– Describes an abstraction in space and time of
the systematic movements of a user.– Exceptional movements are completely
ignored.– Based on trajectory clustering with noise
removalIndividual History Trajectory Clusters Routines
![Page 58: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/58.jpg)
Carpooling cycleStep 2: Build Network of possible carpool matches
Based on “routine containment”− One user can pick up the other along
his trip
Carpooling network− Nodes = users− Edges = pairs of users
with matching routines
passenger
driver
![Page 59: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/59.jpg)
Pro-active suggestions of sharing rides opportunities without the need for the user to explicitly specify the trips of interest.
Matching two routines:
Mobility profile share-ability:
Application: Car pooling
![Page 60: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/60.jpg)
Carpooling cycleStep 3: Optimal allocation of drivers-passengers
• Given a Carpooling Network N, select a subset of edges that minimizes |S|
– S = set of circulating vehicles
provided that the edges are coherent, i.e.:
– indegree(n)=0 OR outdegree(n)=0 (a driver cannot be a passenger)
– indegree(n) ≤ capacity(n)
N
N'
![Page 61: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/61.jpg)
Carpooling cycleThe “simple” ICON Loop
Input mobility data
DM: Extract mobility profiles
Build Carpooling network
CP: Optimal allocation
Users accept/reject suggestions
![Page 62: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/62.jpg)
Carpooling cycleImprovement
• In carpooling (especially if proactive) users might not like the suggested matches– Impossible to know who will accept a given
match– Modeling acceptance might improve results
• Two new components– Learning mechanism to guess success
probability of a carpooling match– Optimization task exploits it to offer
solution with best expected overall success
![Page 63: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/63.jpg)
ML: Learn/update success model
& apply to network
Carpooling cycleRevised ICON Loop
Input mobility data
DM: Extract mobility profiles
Weighted Carpooling network
CP: Allocation with best expected success
Users accept/reject suggestions
Training data
![Page 64: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/64.jpg)
S. Rinzivillo, S. Mainardi, F. Pezzoni, M. Coscia, D. Pedreschi, F. Giannotti
Discovering the Geographical Borders of Human Mobility
KI - Künstliche Intelligenz, 2012.
Networks as a mining tool
![Page 65: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/65.jpg)
Mobility coverages
![Page 66: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/66.jpg)
Step 1: spatial regions
![Page 67: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/67.jpg)
Step 2: evaluate flows among regions
![Page 68: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/68.jpg)
Step 3: forget geography
![Page 69: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/69.jpg)
Step 4: perform community detection
![Page 70: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/70.jpg)
Step 4: perform community detection
![Page 71: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/71.jpg)
Step 5: map back to geography
![Page 72: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/72.jpg)
Step 6: draw borders
![Page 73: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/73.jpg)
Final result
![Page 74: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/74.jpg)
Final result: compare with municipality borders
![Page 75: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/75.jpg)
Borders in different time periodsOnly weekdays movements Only weekend movements
Similar to global clustering: strong influence of systematic movements
Strong fragmentation: the influence of systematic movements (home-work) is missing
![Page 76: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/76.jpg)
Borders at regional scale
![Page 77: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/77.jpg)
Final results
![Page 78: Classification in Mobility Data Miningdidawiki.cli.di.unipi.it/.../dm/dm2_2015_trajectory... · Right: same data represented as a binary matrix of 113 days (D) by 120 (H, which is](https://reader033.fdocuments.net/reader033/viewer/2022052803/5f26a626602d57478174a155/html5/thumbnails/78.jpg)
Comparison with “new provinces”