Collective Traffic Prediction with Partially Observed...

Motivation Related Works CTP Method Experiments Summary

Collective Traffic Prediction with Partially Observed Traffic History using Location-Based Social Media

Xinyue Liu, Xiangnan Kong, Yanhua Li

Worcester Polytechnic Institute

February 22, 2017

About me

◦ I only know Python (2), and it is great.

◦ I think JavaScript, Ruby, Haskell... are cool, but I am too lazy to learn them.

◦ I hate C++.

My Research Interests

◦ Social Network Analysis [CIKM’16, SDM’17b]

◦ Recommender Systems [SDM’16]

◦ Brain Network [SDM’17a, IJCNN’17]

Overview

1 Motivation

2 Related Works

3 CTP Method

4 Experiments

5 Summary

Why Traffic Prediction?

◦ Excessive traffic causes travel delays, wasted resources, and pollution.

◦ In 2011, traffic congestion cost urban Americans 5.5 billion hours of travel delay and 2.9 billion gallons of extra fuel, for a total congestion cost of $121 billion.

Why (Location-Based) Social Media?

Figure: traffic networks (sensor, traffic condition) and location-based social media (semantic data), related through location associations and temporal data (8 AM, 4 PM, 11 PM). Example tweet: “Traffic jam on Storrow Drive, Boston, Massachusetts”.

◦ Location-Based Social Media (LBSM) is popular and can be used as mobile sensors.

◦ Semantic and spatial information from social media can be helpful.

Challenges

◦ Lack of historical traffic data in partial regions.

In real-world road systems, only a small fraction of the road segments are equipped with sensors. It is difficult to predict traffic without traffic history.

◦ Sparsity of LBSM information at fine granularity.

Table: Average # of tweets in each region under different spatiotemporal resolutions

Temporal Resolution   Spatial Resolution   Ave. #Tweets
12 hours              1 × 1                47,113
1 hour                1 × 1                3,926
1 hour                2 × 2                1,306
1 hour                3 × 3                554
1 hour                4 × 4                389
1 hour                30 × 30              15
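The sparsity in the table can be illustrated with a minimal sketch of how such an average might be computed. It assumes that "n × n" means the study area is split into an n by n grid and that the average is taken over non-empty (region, time window) pairs; neither is stated on the slide. The function name, the tweet tuple format, and the toy data are hypothetical.

```python
from collections import Counter

def average_tweets_per_region(tweets, bbox, n, window_hours):
    """Average tweet count over non-empty (grid cell, time window) pairs.

    tweets: iterable of (lat, lon, hour_offset) tuples (hypothetical format).
    bbox:   (lat_min, lat_max, lon_min, lon_max) of the study area.
    n:      the area is split into an n x n grid (assumed meaning of "n x n").
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    counts = Counter()
    for lat, lon, hour in tweets:
        # Clamp to n - 1 so points on the upper boundary stay inside the grid.
        row = min(int((lat - lat_min) / (lat_max - lat_min) * n), n - 1)
        col = min(int((lon - lon_min) / (lon_max - lon_min) * n), n - 1)
        counts[(row, col, int(hour // window_hours))] += 1
    return sum(counts.values()) / len(counts)

# Toy data: four tweets in a unit square within two hours.
tweets = [(0.1, 0.1, 0.5), (0.9, 0.9, 0.7), (0.2, 0.8, 1.5), (0.6, 0.4, 1.9)]
bbox = (0.0, 1.0, 0.0, 1.0)
coarse = average_tweets_per_region(tweets, bbox, n=1, window_hours=2)
fine = average_tweets_per_region(tweets, bbox, n=2, window_hours=1)
```

On the toy data the coarse setting puts all four tweets into one cell, while the finer setting leaves one tweet per non-empty cell, which mirrors the trend in the table.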

Conventional Methods

◦ Auto Regression [Smith and Demetsky, 1997, Journal of Transportation Engineering]

◦ Tweet Semantics [He et al., 2013, IJCAI]

Auto Regression [Smith and Demetsky, 1997]

Figure: timeline of historical traffic data up to time t, connected by spatio-temporal dependencies to the prediction.

◦ $v_g^{(t)} = \alpha + \beta_1 v_g^{(t-1)} + \beta_2 v_g^{(t-2)}$

◦ Fails to work for locations without traffic history.
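The lag-2 auto regression above can be sketched with an ordinary least-squares fit. This is an illustration under assumptions, not the cited authors' implementation: the helper names `fit_ar2` and `predict_next` and the synthetic speed series are made up for the example.

```python
import numpy as np

def fit_ar2(series):
    """Least-squares fit of v(t) = alpha + beta1 * v(t-1) + beta2 * v(t-2)."""
    v = np.asarray(series, dtype=float)
    # Each row holds [1, v(t-1), v(t-2)] for one target v(t).
    X = np.column_stack([np.ones(len(v) - 2), v[1:-1], v[:-2]])
    y = v[2:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # alpha, beta1, beta2

def predict_next(series, coef):
    """One-step-ahead prediction from the last two observations."""
    alpha, beta1, beta2 = coef
    return alpha + beta1 * series[-1] + beta2 * series[-2]

# Synthetic speeds generated by a known AR(2) rule (hypothetical data).
speeds = [60.0, 55.0]
for _ in range(30):
    speeds.append(10.0 + 0.5 * speeds[-1] + 0.3 * speeds[-2])

coef = fit_ar2(speeds)
next_speed = predict_next(speeds, coef)
```

The fit needs the location's own past values, which is why this approach cannot be applied to a location without traffic history.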

Tweet Semantics [He et al., 2013]

Figure: timeline of social media and historical traffic data for locations a to e up to time t, followed by the prediction.

◦ Consider each location independently.

◦ Extract tweet semantics as a bag-of-words feature for each location during a 12-hour time window.

◦ Build an auto regression-like model using both traffic history and tweet semantics.

◦ Fails to work for locations without traffic history.
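The bag-of-words step in the list above can be sketched as follows. The vocabulary, the tokenizer, and the second tweet are hypothetical; the slides do not say how the cited work tokenizes tweets or selects its vocabulary.

```python
from collections import Counter
import re

def bag_of_words(tweets, vocabulary):
    """Count vocabulary words over all tweets from one location and window."""
    counts = Counter()
    for text in tweets:
        # Simple lowercase word tokenizer (an assumption for this sketch).
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return [counts[word] for word in vocabulary]

vocabulary = ["traffic", "jam", "accident", "drive"]  # hypothetical vocabulary
window_tweets = [
    "Traffic jam on Storrow Drive, Boston, Massachusetts",
    "Stuck in traffic again",  # made-up tweet for illustration
]
features = bag_of_words(window_tweets, vocabulary)
print(features)  # [2, 1, 0, 1]
```

A vector like this, one per location and time window, would then be used next to the traffic history as model input.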

Illustration of CTP [Our Method]

Figure: road network of regions a to e, including regions without any sensor. Location-based social media and historical traffic data over time feed the prediction at time t, with congestion and spatio-temporal dependencies marked.

◦ Incorporate LBSM information at finer spatiotemporal granularity.

◦ Consider different locations collectively.

◦ It works for locations without traffic history!

Social Media Semantic Vectors

Spatio-temporal Dependencies: I

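Figure: regions i, j, p, q at time windows t-1 and t, with values $v_i^{(t-1)}, v_j^{(t-1)}, v_p^{(t-1)}, v_q^{(t-1)}$ and $v_i^{(t)}, v_j^{(t)}, v_p^{(t)}, v_q^{(t)}$.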

◦ Same as the traffic history in the auto regression model.

Spatio-temporal Dependencies: II

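Figure: regions i, j, p, q at time windows t-1 and t, with values $v_i^{(t-1)}, v_j^{(t-1)}, v_p^{(t-1)}, v_q^{(t-1)}$ and $v_i^{(t)}, v_j^{(t)}, v_p^{(t)}, v_q^{(t)}$.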

◦ Spatial dependency within a time window.

Spatio-temporal Dependencies: III

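Figure: regions i, j, p, q at time windows t-1 and t, with values $v_i^{(t-1)}, v_j^{(t-1)}, v_p^{(t-1)}, v_q^{(t-1)}$ and $v_i^{(t)}, v_j^{(t)}, v_p^{(t)}, v_q^{(t)}$.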

◦ Spatial dependency across time windows.

CTP Method

Figure: training table with LBSM semantics columns (t-2, t-1) and a response column holding each region's value at time t.

◦ Assume a time lag of 2 here for simplicity.

◦ The response variable can be average speed, total traffic flow, etc.

CTP Method

Figure: training table with columns LBSM semantics (t-2, t-1) and Dependency I, traffic history (t-2, t-1), plus the response at time t.

CTP Method

Figure: training table with columns LBSM semantics (t-2, t-1) and Dependency I, traffic history (t-2, t-1), plus the response at time t. The Dependency I entries are filled in by retrieving each region's historical data.

CTP Method

Figure: training table with columns LBSM semantics (t-2, t-1), Dependency I, traffic history (t-2, t-1), and Dependency II, neighbors' traffic (t), plus the response at time t.

CTP Method

Figure: training table with columns LBSM semantics (t-2, t-1), Dependency I, traffic history (t-2, t-1), Dependency II, neighbors' traffic (t), and Dependency III, neighbors' traffic history (t-2, t-1), plus the response at time t.
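The column layout of the training table can be sketched as a feature-building function. This is a simplified illustration that assumes every value is observed and that neighbors are aggregated by averaging; `build_features`, the dictionary layouts, and the toy data are hypothetical and not taken from the slides.

```python
def build_features(semantics, speed, neighbors, node, t, lag=2,
                   aggregate=lambda xs: sum(xs) / len(xs)):
    """Feature vector for `node` at time window `t`.

    semantics[node][s] -> list of LBSM semantic features in window s
    speed[node][s]     -> traffic value in window s (all observed here)
    neighbors[node]    -> adjacent regions in the road network
    """
    past = [t - d for d in range(lag, 0, -1)]   # [t-2, t-1] for lag = 2
    feats = []
    for s in past:                               # LBSM semantics
        feats.extend(semantics[node][s])
    feats.extend(speed[node][s] for s in past)   # Dependency I: own history
    # Dependency II: neighbors' traffic in the same window.
    feats.append(aggregate([speed[n][t] for n in neighbors[node]]))
    for s in past:                               # Dependency III: neighbors' history
        feats.append(aggregate([speed[n][s] for n in neighbors[node]]))
    return feats

# Toy data: region "a" with neighbors "b" and "c", three time windows.
speed = {"a": [50, 52, 54], "b": [40, 42, 44], "c": [60, 62, 58]}
semantics = {"a": [[1, 0], [0, 2], [3, 1]]}
neighbors = {"a": ["b", "c"]}
feats = build_features(semantics, speed, neighbors, "a", t=2)
print(feats)  # [1, 0, 0, 2, 50, 52, 51.0, 50.0, 52.0]
```

The response for this row would be `speed["a"][2]`, the region's own value at time t.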

CTP Method

Figure: training table with columns LBSM semantics (t-2, t-1), Dependency I (t-2, t-1), Dependency II (t), and Dependency III (t-2, t-1), plus the response.

• Computed using an aggregation function (e.g. average).

• Response = speed, aggregation function = AVG.

• Among the neighboring values at time t, two are observed (50 and 45) and two are unobserved.

• The Dependency-II feature for node A at time t is the average of the observed values: $(50 + 45) / 2 = 47.5$.
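The worked example above can be reproduced in a few lines. This sketch assumes unobserved values are marked with `None`; the helper name `aggregate_observed` is hypothetical.

```python
def aggregate_observed(values, func=lambda xs: sum(xs) / len(xs)):
    """Apply `func` to the observed values; None marks an unobserved neighbor."""
    observed = [v for v in values if v is not None]
    # With no observed neighbor there is nothing to aggregate.
    return func(observed) if observed else None

neighbor_speeds = [50, 45, None, None]  # two observed, two unobserved
dependency_ii = aggregate_observed(neighbor_speeds)
print(dependency_ii)  # 47.5
```

Passing a different `func`, such as `max` or `min`, swaps the aggregation function without changing how unobserved values are skipped.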

CTP Method

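Figure: the training table (only observed regions) with columns LBSM semantics (t-2, t-1), Dependency I (t-2, t-1), Dependency II (t), and Dependency III (t-2, t-1). Below it, the table for unobserved regions is shifted forward by one time window (t-1, t, t+1); in the bootstrap step its unknown entries are set to 0 before the response is predicted.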

CTP Method

Figure: the training table (only observed regions) and the table for unobserved regions, as in the bootstrap step. In iterative inference, the entries for unobserved regions keep updating as new responses are predicted.
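The train, bootstrap, and iterate loop can be sketched in the style of a generic iterative collective inference procedure. This is a heavily simplified stand-in, not the authors' algorithm: it uses a single time window, one scalar LBSM feature per region, only a Dependency-II style neighbor average, and a linear least-squares model. All names and the toy data are hypothetical.

```python
import numpy as np

def neighbor_avg(values, neighbors, node):
    """Average of the neighbors' known values; 0.0 if none is known yet."""
    known = [values[n] for n in neighbors[node] if n in values]
    return sum(known) / len(known) if known else 0.0

def collective_inference(sem, observed, neighbors, n_iter=10):
    """sem[node]: LBSM feature; observed[node]: measured traffic value."""
    # Training uses observed regions only.
    X = np.array([[1.0, sem[n], neighbor_avg(observed, neighbors, n)]
                  for n in observed])
    y = np.array([observed[n] for n in observed])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

    def predict(node, values):
        feats = [1.0, sem[node], neighbor_avg(values, neighbors, node)]
        return float(np.dot(w, feats))

    unobserved = [n for n in sem if n not in observed]
    est = dict(observed)
    # Bootstrap: first estimates rely on observed neighbors only.
    est.update({n: predict(n, observed) for n in unobserved})
    # Iterative inference: keep updating with the neighbors' latest estimates.
    for _ in range(n_iter):
        est.update({n: predict(n, est) for n in unobserved})
    return est

# Toy road network: a chain a-b-c-d-e-f where e and f have no sensor.
sem = {"a": 1.0, "b": 1.5, "c": 2.0, "d": 2.5, "e": 3.0, "f": 3.5}
observed = {"a": 60.0, "b": 55.0, "c": 50.0, "d": 45.0}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
             "d": ["c", "e"], "e": ["d", "f"], "f": ["e"]}
est = collective_inference(sem, observed, neighbors)
```

Observed values are never overwritten; only the estimates for regions without sensors change between iterations.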

Dataset

◦ Traffic Data

Collected from the California Performance Measurement System (PeMS) between October 19 and November 28, 2014. 31,102,272 entries of traffic records.

◦ LBSM Data

Tweets collected from the same area during the same time range using the Twitter streaming API. This collection results in a total of 2,648,446 tweets.

Compared Methods

◦ TDO [Smith and Demetsky, 1997]: Auto regression model using traffic history.

◦ TDO-floor [——–]: Similar to TDO, except it uses full traffic history.

◦ TwSeO: A degenerate version of [He et al., 2013], using tweet semantics.

Experimental Setting

◦ Partition the data into two parts, with the beginning $(1 - \frac{1}{u})$ as the training set and the remaining $\frac{1}{u}$ as the test set ($u = 3, \ldots, 7$).

◦ k-fold cross-validation is used to randomly sample 1/k of the regions as unobserved (k = 2, 3, 4, 5).

◦ Root Mean Square Error (RMSE) is used to evaluate the performance.
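The three parts of the setting above can be sketched with the standard library. The rounding at the split point, the shuffling scheme, and all function names are assumptions made for illustration; the slides do not give these details.

```python
import math
import random

def chronological_split(windows, u):
    """First (1 - 1/u) of the time windows for training, last 1/u for testing."""
    cut = len(windows) - len(windows) // u
    return windows[:cut], windows[cut:]

def region_folds(regions, k, seed=0):
    """Shuffle regions into k folds; each fold is treated as unobserved once."""
    shuffled = list(regions)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

train, test = chronological_split(list(range(21)), u=7)   # 18 train, 3 test
folds = region_folds(["r%d" % i for i in range(10)], k=5)  # 5 folds of 2 regions
error = rmse([50.0, 45.0], [47.0, 49.0])
```

In each round, the regions of one fold would have their traffic history hidden, and the RMSE would be computed on their predictions in the test windows.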

Results

Figure: results plot (lower is better), with our method marked.

◦ TDO-floor performs the best by using full traffic history.

◦ The proposed CTP outperforms TDO and TwSeO.

◦ The result shows the effectiveness of incorporating tweet semantics into the collective inference model.

The effect of r

Figure: Test Ratio = 1/7 (u = 7). Plot annotations: lower is better; our method; sparser information in LBSM.

The effect of k

Figure: u = 6, r = 5. Plot annotations: lower is better; our method; fewer unobserved regions.

Summary

◦ Problem Studied

Traffic prediction with partially observed traffic history.

◦ Proposed Model

Using LBSM data to alleviate the issue of absent traffic history. A collective inference model that exploits the complex spatio-temporal dependencies between road segments and incorporates LBSM semantics in the prediction.

Q&A

Xinyue Liu ([email protected])
Xiangnan Kong ([email protected])
Yanhua Li ([email protected])