Deep Learning Based Urban Analytics Platform: Applications to Traffic Flow Modeling and Prediction

Archit Parnami, Dept. of Computer Science, University of North Carolina - Charlotte, [email protected]
Prajval Bavi, Dept. of Computer Science, University of North Carolina - Charlotte, [email protected]
Dimitris Papanikolaou, Dept. of Software & Information Systems, University of North Carolina - Charlotte, [email protected]
Srinivas Akella, Dept. of Computer Science, University of North Carolina - Charlotte, [email protected]
Minwoo Lee, Dept. of Computer Science, University of North Carolina - Charlotte, [email protected]
Siddharth Krishnan, Dept. of Computer Science, University of North Carolina - Charlotte, [email protected]

ABSTRACT
Cities have become more networked and wired, thus generating unprecedented amounts of operational and behavioral data. For instance, a city's traffic is instrumented by online mapping services (Google Maps) in conjunction with crowd-sourcing (Waze). In this work, we present an Urban Analytics Platform (UAP) that demonstrates a systematic framework to tackle the data-driven problems that modern urban settings present. UAP embodies an end-to-end system for ingesting multiple data sources and processing them via neural network models to tackle predictive urban computing problems, like traffic flow. Results show improved traffic flow prediction and how we can utilize the improved prediction to identify possible weather conditions behind abnormal traffic patterns.

KEYWORDS
Data Fusion, Urban Data, Traffic, Weather, Tweet, Surrogate Data

ACM Reference Format:
Archit Parnami, Prajval Bavi, Dimitris Papanikolaou, Srinivas Akella, Minwoo Lee, and Siddharth Krishnan. 2018. Deep Learning Based Urban Analytics Platform: Applications to Traffic Flow Modeling and Prediction. In Proceedings of Mining Urban Data Workshop (MUD3). ACM, New York, NY, USA, 9 pages.
https://doi.org/10.1145/nnnnnnn.nnnnnnn

1 INTRODUCTION
In today's data-rich world, datafication of critical infrastructures like transportation — road networks and public services — has become an integral aspect of the unprecedented urbanization that we are witnessing. Modeling and understanding traffic flow has always been a problem of interest for city planners. Web services like Google Maps, Waze, online weather monitoring, and surrogate sources like Twitter have become useful instruments to sense and predict traffic. Towards that end, we design an Urban Analytics Platform (UAP) for the city of Charlotte that will systematically ingest multi-modal urban data from traditional sources and surrogate sources to tackle urban problems in a data-driven fashion. In this work, we describe how UAP will embody predictive models to forecast and inform the city about traffic flow and enable efficient operation of shared mobility systems.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). MUD3, Aug 2018, London, UK. © 2018 Copyright held by the owner/author(s). ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. https://doi.org/10.1145/nnnnnnn.nnnnnnn

Figure 1: Data Fusion Engine. The UAP's data fusion engine aims to harness information from multiple data sources, such as Google Maps, weather forecasts, and Twitter, and use it for applications like city planning, traffic prediction, and detection.
The larger goal of UAP is to effectively and seamlessly integrate data from the city (community safety data, planning and zoning, transportation, neighborhoods & housing, and city government, via http://clt-charlotte.opendata.arcgis.com/) with surrogate data sources like social media (Twitter, Facebook, Reddit), transportation (Google Maps, Waze), and weather data (OpenWeatherMap), as represented in Figure 1. The core component of the platform is a data fusion engine, which combines the various ingested data sources into an application-specific repository. For instance, in this work we demonstrate how UAP has utilized Waze to make traffic predictions and OpenWeatherMap to detect anomalies in traffic. UAP will further equip the City of Charlotte with well-enriched urban datasets and "first principle" models that position us to answer


trans-disciplinary urban research questions using a data-driven discovery and analytics approach.

Summary of Results. We utilize a stacked LSTM to make accurate traffic predictions. By training on weekday and weekend data separately, we further improve the prediction results. The trained LSTM prediction model is then used to detect abnormal traffic and correlate it with weather conditions. The results are presented in the following sections.

2 RELATED WORK
Data fusion, as defined by Waltz and Llinas [22], is "a multilevel, multifaceted process dealing with the automatic detection, association, correlation, estimation, and combination of data and information from single and multiple sources to achieve refined position and identity estimates and complete and timely assessments of situations and threats and their significance." With so many sources of data, multi-sensor data fusion helps us understand the environment and act accordingly. Baloch et al. [7] describe a way to gather more accurate and additional data using more than one sensor. They propose a layered approach for context acquisition and filtering using multiple subtasks.

Leveraging data fusion techniques, El Faouzi and Klein [10] survey the state of the art in applying sensor data fusion to traffic data, and report that data fusion techniques indeed provide encouraging results. However, widespread use of data fusion in the transportation field also faces challenges, such as obtaining data accurate enough to make applications effective and dynamic, and the need to process the data in real time. Bachmann et al. [6] investigate how multi-sensor fusion of data from loop detectors, Bluetooth-enabled devices, and GPS-equipped vehicles improves estimation accuracy, and conclude that accuracy depends on the technology, the number of probe vehicles, and the traffic state. He et al. [12] propose an optimization framework that improves traffic prediction using Twitter semantic data.

A major issue in getting real-time traffic flow information is that the majority of road links are not equipped with traffic sensors. Abadi et al. [4] used a dynamic traffic simulator to generate flows on all links using the available traffic information, estimated demand, and historical data. They then used an autoregressive model to predict traffic flows up to 30 minutes ahead using the real-time and estimated traffic data. With huge amounts of traffic data available, the question arises of which data are relevant. Polson et al. [18] have shown that recent observations of traffic conditions (i.e., within the last 40 minutes) are stronger predictors than historical values, i.e., measurements from 24 hours ago. Lv et al. [17] proposed the use of stacked autoencoders to learn latent traffic flow feature representations, such as nonlinear spatial and temporal correlations, from traffic data, and showed the proposed method to be superior to other machine learning methods such as support vector machines, radial basis functions, and neural networks.

Recently, Long Short-Term Memory (LSTM) networks [13, 19] have been used for various time series forecasting tasks [5, 14, 16, 20, 21] and have achieved state-of-the-art results. For traffic congestion prediction, Chen et al. [8] proposed an LSTM model; they introduced a classification technique to determine congestion and achieved improved accuracies. Our work extends their approach and utilizes LSTMs to make estimated traffic predictions. Moreover, we incorporate weather statistics in assessing traffic conditions. The proposed approach is expected to broaden further to incorporate various surrogate data sources.

3 DATA COLLECTION FRAMEWORK
3.1 Traffic Data
Our traffic data consists of the estimated time of travel from point A to point B using a car as the mode of transportation. Two simple ways to get the estimated traffic are:

3.1.1 Google Maps. Using Google Maps' Distance Matrix API [1], we can get the estimated travel duration and distance between a pair of points. Google's API calls are free only up to a limit of 2,500 calls per day.

3.1.2 Waze. Waze [3] is another navigation app that can provide traffic information. Waze is, however, crowdsourced and does not require an API key. For this project, we used web scraping to get traffic information from Waze, as represented in Figure 2. Since this data collection is done via web scraping, only a limited number of requests can be made per second to the Waze website. Through experimentation, we determined that we can make at most approximately two requests per second from a unique IP. The following equations define the data collection parameters that determine the number of servers needed, based on the number of sources and destinations:

Number of sources = N
Number of destinations = M
Total number of routes: R = N × M
Time taken to process a single request (seconds) = t
Delay between each request (seconds) = d
Limit on number of requests per second: L = 1/(t + d)
Total time in seconds to serve all requests: T = R/L
Data collection interval (period) in seconds = P
Buffer time in seconds = B
Number of servers (unique IPs) needed: S = ⌊T/(P − B)⌋ + 1
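The equations above combine into a small helper. The following is a sketch with our own function and variable names, following the formulas term by term:

```python
import math

def servers_needed(n_sources, n_destinations, t_request, d_delay, period_s, buffer_s):
    routes = n_sources * n_destinations        # R = N x M
    rate_limit = 1.0 / (t_request + d_delay)   # L = 1 / (t + d) requests per second
    total_time = routes / rate_limit           # T = R / L seconds to serve all routes
    return math.floor(total_time / (period_s - buffer_s)) + 1  # S

# Parameters from Table 1: 42 fire stations, 26 rail stations,
# 0.5 s per request, 0.4 s delay, 5-minute period, 60 s buffer.
print(servers_needed(42, 26, 0.5, 0.4, 300, 60))  # → 5
```

With the Table 1 parameters this reproduces the five servers used in the study.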

3.2 Weather Data
Weather data includes information such as temperature, pressure, humidity, and the current conditions for the day. OpenWeatherMap [2] is one such service that provides daily weather forecasts and current weather conditions. Utilizing the OpenWeatherMap API, we get access to current and past weather conditions in the city of Charlotte.


Figure 2: Web scraping. To obtain traffic information, we periodically make GET requests to the Waze website from different IP locations.

3.3 Interest Point Selection for Data Collection
Since we are constrained by the number of servers available for data collection, we need to carefully select the source and destination points for traffic data collection. These points should be evenly spread throughout the city so that they represent all of the city's traffic. The city of Charlotte has 42 fire stations evenly spread throughout the city, represented in red in Figure 3. It also has 26 light rail stations linearly distributed through the middle of the city, represented by blue dots. Collecting estimated travel durations between fire stations and rail stations periodically throughout the day acts as a proxy that helps us model the inflow and outflow of traffic in the city.

Figure 3: Source and destination points for traffic collection. The city of Charlotte has 42 fire stations spread throughout the city (red pentagons), which act as sources, and 26 rail stations (blue dots), which act as destinations for our traffic data collection.

Using the equations listed in the previous section, we estimate the number of servers (unique IPs) required to collect data every 5 minutes using web scraping. The chosen parameters are listed in Table 1.

Parameter            Value
N                    42
M                    26
R = N × M            42 × 26 = 1092
t (experimental)     0.5 s
d (experimental)     0.4 s
L = 1/(t + d)        1/0.9 ≈ 1.11
T = R/L              1092 × 0.9 = 982.8 s
P                    5 × 60 = 300 s
B                    60 s
S                    ⌊982.8 / (300 − 60)⌋ + 1 = 5

Table 1: Data collection parameters.

The buffer time B is the time in seconds to wait after making all the requests; it can be useful for read/write operations and the like, and can be adjusted accordingly.

3.4 Sample Data
In the experiment, we use traffic data (Table 2) collected from Waze between a given (fire station, rail station) pair over a period of 60 days.

Date     Clock    Fire Station  Rail Station       Time (s)  Distance (m)
3/7/18   0:00:00  40            I-485/South Blvd   2243      29885
3/7/18   0:00:02  40            Sharon Road West   2195      28668
3/7/18   0:00:04  40            Arrowood           2193      28269
3/7/18   0:00:05  40            Archdale           2061      26495
3/7/18   0:00:06  40            Tyvola             1976      25808
3/7/18   0:00:08  40            Woodlawn           2048      26957

Table 2: Traffic data.

At the same time, we also collect hourly weather information using the OpenWeatherMap API. The data (Table 3) contains the current temperature, minimum and maximum temperature within the hour, humidity, wind speed, and a weather description (rain, light rain, thunderstorm, sky is clear).


Date     Time     Temp (K)  Temp Min  Temp Max  Pressure  Humidity  Weather
3/7/18   0:00:00  278.64    277.1     280.15    1008      93        mist
3/7/18   1:00:00  278.14    276.15    279.15    1007      93        mist
3/7/18   2:00:00  278.06    276.15    279.15    1007      93        mist
3/7/18   3:00:00  278.31    277.15    279.15    1006      100       mist
3/7/18   4:00:00  278.65    277.15    280.15    1006      100       mist
3/7/18   5:00:00  278.73    277.15    280.15    1006      100       mist

Table 3: Weather data.

3.5 Traffic Analysis
To make sensible predictions, we first analyze the traffic patterns on a selected route throughout the week.

Figure 4: Average traffic from Monday to Sunday. The x-axis represents the time of the day and the y-axis represents the estimated time of travel in seconds on this route.

As seen in Figure 4, the traffic flow patterns on weekdays are similar. There is a spike in traffic from 7:00 AM to 8:00 AM, which can be attributed to people driving to work. The same spike is missing on Saturdays and Sundays, these being work holidays. There is also noticeably less traffic late on Sunday night, i.e., in the early hours of Monday.

Figure 5: Weekday pattern. Overlapping traffic patterns from Monday to Friday reveal the weekday traffic pattern.

Traffic patterns from Monday to Friday, when plotted together (Figure 5), reveal the general trend of weekday traffic.

Figure 6: Weekend pattern.

Saturdays and Sundays have similar traffic patterns, except that the intensity of traffic is lower on Sundays (Figure 6).

4 DEEP LEARNING-BASED TRAFFIC PREDICTION MODEL
In this section we discuss Long Short-Term Memory (LSTM) networks, a particular type of Recurrent Neural Network (RNN), and how they can be used to model traffic flow for prediction.

4.1 Recurrent Neural Networks
Recurrent neural networks (RNNs) [23] are similar to feedforward neural networks, except that they also have connections pointing backwards. A simple RNN is composed of one neuron receiving inputs, producing an output, and sending that output back to itself. The output of this single RNN neuron is

y(t) = ϕ(x(t)^T · wx + y(t−1)^T · wy + b)

At each time step t, this neuron receives the input x(t) as well as its own output from the previous time step, y(t−1). Similarly, for an RNN layer, at each time step t every neuron in the layer receives both the input vector x(t) and the output vector from the previous time step, y(t−1).

Although RNNs have the potential to learn contextual dependencies, and their gradients are easy to compute, the range of input sequences they can handle is limited in practice due to the vanishing gradient problem. Various efforts have been made to tackle this problem since the 1990s. Among them, the LSTM model, a variant of the standard RNN, was proposed by Hochreiter and Schmidhuber [13].


4.2 Long Short-Term Memory (LSTM)

Figure 7: LSTM cell.

Figure 7 illustrates an LSTM [13, 19] architecture. This cell has its state split into two vectors, h(t) and c(t), where h(t) is the short-term state and c(t) the long-term state. As the long-term state c(t−1) traverses the network, it first goes through a forget gate, dropping some memories, and then adds some new memories, selected by the input gate, via the addition operation. The result, c(t), is the output of the cell without any further transformation. After the addition operation, the long-term state is also copied and passed through a tanh function to produce h(t), which is equal to the cell's output y(t) for this time step. The main layer is the one that outputs g(t); it analyzes the current input x(t) and the previous short-term state h(t−1). The three other layers, f(t), i(t), and o(t), are gate controllers; they use the logistic activation function, so their outputs range from 0 to 1. Their outputs are fed to element-wise multiplication operations: if they output 0s they close the gate, and if they output 1s they open it. Hence an LSTM can learn to recognize an important input, store it in the long-term state, preserve it for as long as it is needed, and extract it whenever it is needed. The equations below summarize the computation in the LSTM cell:

i(t) = σ(Wxi^T · x(t) + Whi^T · h(t−1) + bi)
f(t) = σ(Wxf^T · x(t) + Whf^T · h(t−1) + bf)
o(t) = σ(Wxo^T · x(t) + Who^T · h(t−1) + bo)
g(t) = tanh(Wxg^T · x(t) + Whg^T · h(t−1) + bg)
c(t) = f(t) ⊗ c(t−1) + i(t) ⊗ g(t)
y(t) = h(t) = o(t) ⊗ tanh(c(t))

where Whi, Whf, Who, and Whg are the weight matrices of each of the four layers for their connections to the previous short-term state h(t−1), while Wxi, Wxf, Wxo, and Wxg are the weight matrices of each of the four layers for their connections to the input x(t). The terms bi, bf, bo, and bg are the biases for each of the four layers.
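As a concrete illustration, a single LSTM time step can be sketched in NumPy. This is our own minimal implementation of the gate equations above, not the authors' code; for compactness it fuses each Wx*/Wh* pair into one matrix per gate acting on the concatenated vector [x(t); h(t−1)], which is mathematically equivalent.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation used by the three gate controllers.
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following the gate equations above."""
    z = np.concatenate([x_t, h_prev])   # [x(t); h(t-1)]
    i = sigmoid(W["i"] @ z + b["i"])    # input gate
    f = sigmoid(W["f"] @ z + b["f"])    # forget gate
    o = sigmoid(W["o"] @ z + b["o"])    # output gate
    g = np.tanh(W["g"] @ z + b["g"])    # candidate memories
    c_t = f * c_prev + i * g            # long-term state update
    h_t = o * np.tanh(c_t)              # short-term state = cell output y(t)
    return h_t, c_t

# Toy dimensions: one traffic value in, 4 hidden units; random weights.
rng = np.random.default_rng(0)
n_in, n_hidden = 1, 4
W = {k: rng.standard_normal((n_hidden, n_in + n_hidden)) for k in "ifog"}
b = {k: np.zeros(n_hidden) for k in "ifog"}
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
h, c = lstm_step(np.array([0.5]), h, c, W, b)
print(h.shape)  # (4,)
```

Because h(t) = o(t) ⊗ tanh(c(t)) with o(t) in (0, 1), every component of the output stays strictly inside (−1, 1).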

4.3 Prediction Model
While the original LSTM model comprises a single hidden LSTM layer followed by a standard feedforward output layer, a stacked LSTM has multiple hidden LSTM layers, each containing multiple memory cells. Studies [9, 11, 15] have shown that deep LSTM architectures with several hidden layers can build up progressively higher-level representations of sequence data and thus work more effectively. In these deep architectures, the output of each LSTM hidden layer is fed as the input to the subsequent LSTM hidden layer. We experimented with different numbers of stacked LSTM layers and empirically determined that a stacked LSTM with 4 layers works best for our data (Figure 8).

The code base and the data for this study will be made available on our GitHub repository (https://github.com/ArchitParnami/UAP) for reproducibility and further research.

5 TRAFFIC FLOW PREDICTION
The following subsections describe the traffic flow prediction model for a single route, i.e., the shortest route from Fire Station 20 to JW Clay Rail Station. This route connects the city of Charlotte from one end to the other. The model can be generalized to predict traffic on any route. Furthermore, we assume that the traffic data from Waze, i.e., the estimated time of travel between two points obtained from Waze, is a faithful representation of the actual traffic.

5.1 Data Representation
Since we collect data every 5 minutes on a route, we have 288 time intervals in a 24-hour day. For a day i, the data xi is

xi = [t1, t2, ..., t288]

The training set has 60 days of data, so the entire training data can be represented as

Xtrain = [x1, x2, ..., x60]

We try to predict the traffic for the 61st day (i.e., the subsequent day). Therefore the test set is the traffic for the next day:

Xtest = [x61]

5.2 Data Preprocessing
Xtrain is flattened into a single list with the number of observations equal to 288 × 60 = 17280, so

Xtrain = [t1, t2, ..., t17280]

The training data is then normalized into the range [0, 1]:

Xtrain = Normalize(Xtrain)

Similarly, the test data is also normalized:

Xtest = Normalize(Xtest)

Figure 8: Architecture of the traffic prediction model. The first layer represents the input layer, which contains the traffic data from the current and previous time intervals. This data is fed into a stacked LSTM model of four layers, which is connected to a dense layer that outputs the estimated traffic for the next time interval.

An LSTM model expects input in the form of a sequence. The length of the sequence, also known as the number of time steps, is a hyperparameter that can be set. For this problem, the sequence length is set to 12. That means we train the LSTM model on the past hour of traffic data and try to predict the traffic in the next 5-minute interval. After this transformation, the training observations look like this:

x1 = [t1, t2, ..., t12], y1 = [t13]
x2 = [t13, t14, ..., t24], y2 = [t25]

Finally, the training set has the form

Xtrain = [x1, x2, ..., xn], Ytrain = [y1, y2, ..., yn]

The test data is also split into sequences of length 12. To predict the first time interval of the 61st day, we add traffic information from the previous day, i.e., the last sequence of the 60th day, to our test data.
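The preprocessing steps in this section (min-max normalization and slicing into length-12 sequences) can be sketched as follows. The helper names are ours, and the stride of 12 mirrors the non-overlapping example pairs above (x1 = [t1..t12] → y1 = t13, x2 = [t13..t24] → y2 = t25).

```python
import numpy as np

def normalize(series):
    """Min-max scale a series into [0, 1]."""
    series = np.asarray(series, dtype=float)
    return (series - series.min()) / (series.max() - series.min())

def make_sequences(series, n_steps=12, stride=12):
    """Split a flat series into (past-hour window, next-interval target) pairs."""
    X, y = [], []
    for i in range(0, len(series) - n_steps, stride):
        X.append(series[i:i + n_steps])  # 12 observations = 1 hour at 5-min intervals
        y.append(series[i + n_steps])    # the following 5-minute interval
    return np.array(X), np.array(y)

# Stand-in for the flattened 60 days x 288 intervals of travel times.
flat = normalize(np.arange(17280, dtype=float))
X_train, y_train = make_sequences(flat)
print(X_train.shape, y_train.shape)  # (1439, 12) (1439,)
```

With a stride of 1 instead, every 5-minute interval would yield a training pair, which is another common windowing choice.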

5.3 Prediction Results
The red line in Figure 9 represents the real traffic, whereas the blue line represents the traffic predictions made by our LSTM model depicted in Figure 8. The y-axis is the ETA in seconds and the x-axis is the time of the day.

While the stacked LSTM architecture fits very well for making traffic predictions, we observed that having a separate stacked LSTM model just for weekdays could improve the prediction results. This is based on our earlier observation that the weekday traffic pattern differs from the weekend traffic pattern, so a separate model is beneficial (Figure 10).

Figure 9: Real traffic (red) vs. predicted traffic (blue) when trained on all the historic data (mean squared error = 5.17).

Figure 10: Real traffic vs. predicted traffic when trained only on weekdays (mean squared error = 4.61).

Date       Weekday Avg. MSE   Weekday Avg. MAPE   All-day Avg. MSE   All-day Avg. MAPE
05/15/18   4.59               4.66                4.73               4.73
05/16/18   3.41               4.51                3.49               4.60
05/17/18   3.97               4.77                4.00               4.86
05/18/18   3.79               4.72                3.84               4.87
Average    3.94               4.66                4.01               4.76

Table 4: Mean squared error and mean absolute percentage error, averaged over 10 training and test runs, for the weekday and all-day prediction models.

From Table 4 we can conclude that the weekday training model performed better than the all-day training model. Figure 11 displays the average residual for the weekday model vs. the all-day model on a sample route. The green areas represent the intervals where the average prediction error was lower when using the weekday model, whereas the red areas capture the intervals where the average prediction error was higher when using the weekday model. The figure reaffirms the efficacy of the weekday training model.

Figure 11: Training models' prediction error. Green regions indicate lower prediction errors when the model was trained only on weekdays, and red regions indicate higher prediction errors.

5.4 Comparing Results with a Multilayer Perceptron Model
The following results were obtained when we trained an artificial neural network with 3 hidden layers (288, 16, 8). The inputs to the model were the day of the week (1-7) and the time interval (1-288), and the output was the traffic (estimated time of travel).

Figure 12: Real traffic vs. predicted traffic on 05/15/18, obtained using a multilayer perceptron model trained on all days.

Date       MSE    MAPE
05/15/18   7.90   5.50
05/16/18   3.70   5.46
05/17/18   4.92   6.22
05/18/18   7.35   6.19
Average    5.97   5.84

Table 5: Mean squared error and mean absolute percentage errors obtained using the multilayer perceptron model.


It can be observed that while the multilayer perceptron model is able to learn the general traffic pattern, the LSTM model captures the traffic pattern more accurately, with lower MSE and MAPE.
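For reference, the two error metrics reported in Tables 4 and 5 can be computed as follows. This is a generic sketch (the function names are ours), with MAPE expressed as a percentage:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# Toy example: two ETAs of 100 s and 200 s, each predicted 10 s off.
print(mse([100, 200], [110, 190]))   # → 100.0
print(mape([100, 200], [110, 190]))  # → 7.5
```

Note that MAPE weights errors relative to the true travel time, so the same 10-second miss counts for less on the longer route.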

6 CORRELATING DEVIATIONS IN TRAFFIC PREDICTION WITH WEATHER CONDITIONS
Traffic conditions are dynamic in nature. Many uncontrollable factors can cause deviations in traffic patterns, for example a sudden thunderstorm, an accident, or a game event. Therefore an LSTM model certainly cannot make accurate predictions all the time, considering that other factors in play can cause sudden changes in traffic conditions. Here we focus on one such factor, weather, and try to correlate and observe its influence on traffic conditions.

Figures 13 and 14 show the changing weather conditions at different time intervals. For example, it rained around 1:00-2:00 AM, and there was a thunderstorm followed by rain at 12:00 PM. The figures also display the real traffic and the LSTM-predicted traffic. In certain regions, such as during the thunderstorm, the deviation, i.e., the prediction error, was observed to be larger than normal, i.e., than when the sky was clear.

Figure 13: Peak deviations in traffic correlated with weather conditions. The arrows point at the traffic condition at a particular time and the corresponding weather.

Figure 14: Deviation in traffic prediction with weather. The orange region highlights the interval of a thunderstorm and the dark green region highlights the interval of heavy rain. These two regions correspond to the highest deviations in traffic prediction.

Intuitively, and experimentally from Figure 14, it can be said that traffic conditions are definitely affected by weather conditions. This impact of anomalous weather on traffic can be measured using LSTM predictions, as observed in this study.
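One simple way to operationalize this observation — flagging intervals where the LSTM's prediction deviates unusually far from the observed traffic — is a residual-threshold check. This is a sketch of our own (the paper does not specify a detection rule); the function name and the k-standard-deviation threshold are assumptions:

```python
import numpy as np

def flag_anomalies(y_true, y_pred, k=3.0):
    """Return indices of intervals whose absolute prediction residual
    exceeds the mean residual by k standard deviations."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    residuals = np.abs(y_true - y_pred)
    threshold = residuals.mean() + k * residuals.std()
    return np.where(residuals > threshold)[0]

# Toy series of ETAs (s): one spike, e.g. during a thunderstorm, at index 4.
y_true = np.array([100.0, 102.0, 99.0, 101.0, 180.0, 100.0])
y_pred = np.array([101.0, 100.0, 100.0, 100.0, 103.0, 101.0])
print(flag_anomalies(y_true, y_pred, k=2.0))  # → [4]
```

Flagged intervals could then be joined against the hourly weather records of Table 3 to look for co-occurring conditions such as thunderstorms or heavy rain.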

7 CONCLUSION
In this paper we have discussed a low-cost data acquisition and fusion model for traffic flow analysis. To get high-quality traffic data, researchers [6, 10] must typically rely on government sources for data from hardware devices (i.e., loop detectors, GPS). In view of the large amount of data required for this project, paid services like Google Maps are often not an option. Thus, acquiring high-quality traffic data is one of the challenges in traffic prediction. However, by applying web scraping to a crowdsourced website, i.e., Waze, we could gather estimated traffic information for the city. The OpenWeatherMap API allows us to retrieve relevant information on the possible weather conditions behind unexpected traffic flows.

Our LSTM model has shown good performance for short-term prediction; however, it lacks the capability to make long-term predictions. Utilizing the data fusion engine, we aim to build a more sophisticated model that can not only make long-term traffic predictions but can also inform us about anomalous traffic behavior using diverse data sources such as social networks, news headlines, and accident logs. This will require the development of an efficient anomaly detection technique that can automate the informing process.

REFERENCES
[1] [n. d.]. Google Maps Distance Matrix API. https://developers.google.com/maps/documentation/distance-matrix/intro. Accessed: 2018-05-29.
[2] [n. d.]. OpenWeatherMap. https://openweathermap.org. Accessed: 2018-05-29.
[3] [n. d.]. Waze. https://www.waze.com. Accessed: 2018-05-29.
[4] Afshin Abadi, Tooraj Rajabioun, and Petros A. Ioannou. 2015. Traffic flow prediction for road transportation networks with limited traffic data. IEEE Transactions on Intelligent Transportation Systems 16, 2 (2015), 653–662.
[5] Michael Auli, Michel Galley, Chris Quirk, and Geoffrey Zweig. 2013. Joint language and translation modeling with recurrent neural networks. (2013).
[6] Chris Bachmann, Baher Abdulhai, Matthew J. Roorda, and Behzad Moshiri. 2013. A comparative assessment of multi-sensor data fusion techniques for freeway traffic speed estimation using microsimulation modeling. Transportation Research Part C: Emerging Technologies 26 (2013), 33–48.
[7] Zartasha Baloch, Faisal Karim Shaikh, and Mukhtiar Ali Unar. 2018. A context-aware data fusion approach for health-IoT. International Journal of Information Technology 10, 3 (2018), 241–245.
[8] Yuan-yuan Chen, Yisheng Lv, Zhenjiang Li, and Fei-Yue Wang. 2016. Long short-term memory model for traffic congestion prediction with online open data. In 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC). IEEE, 132–137.
[9] Zhiyong Cui, Ruimin Ke, and Yinhai Wang. 2018. Deep Bidirectional and Unidirectional LSTM Recurrent Neural Network for Network-wide Traffic Speed Prediction. arXiv preprint arXiv:1801.02143 (2018).
[10] Nour-Eddin El Faouzi and Lawrence A. Klein. 2016. Data fusion for ITS: techniques and research needs. Transportation Research Procedia 15 (2016), 495–512.
[11] Yoav Goldberg. 2016. A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research 57 (2016), 345–420.
[12] Jingrui He, Wei Shen, Phani Divakaruni, Laura Wynter, and Rick Lawrence. 2013. Improving Traffic Prediction with Tweet Semantics. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). 1387–1393.
[13] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
[14] Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3128–3137.
[15] Xiangang Li and Xihong Wu. 2015. Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 4520–4524.

[16] Zachary Chase Lipton David C Kale Charles Elkan and Randall C Wetzel2015 Learning to Diagnose with LSTM Recurrent Neural Networks CoRRabs151103677 (2015) arXiv151103677 httparxivorgabs151103677

[17] Yisheng Lv Yanjie Duan Wenwen Kang Zhengxi Li and Fei-Yue Wang 2015Traffic flow prediction with big data a deep learning approach IEEE Transactionson Intelligent Transportation Systems 16 2 (2015) 865ndash873

[18] Nicholas G Polson and Vadim O Sokolov 2017 Deep learning for short-termtraffic flow prediction Transportation Research Part C Emerging Technologies 79(2017) 1ndash17

[19] Haşim Sak Andrew Senior and Franccediloise Beaufays 2014 Long short-term mem-ory based recurrent neural network architectures for large vocabulary speechrecognition arXiv preprint arXiv14021128 (2014)

[20] Ilya Sutskever Oriol Vinyals and Quoc V Le 2014 Sequence to sequence learningwith neural networks In Advances in neural information processing systems 3104ndash3112

[21] Oriol Vinyals Alexander Toshev Samy Bengio and Dumitru Erhan 2015 Showand tell A neural image caption generator In Computer Vision and Pattern Recog-nition (CVPR) 2015 IEEE Conference on IEEE 3156ndash3164

[22] Edward Waltz James Llinas et al 1990 Multisensor data fusion Vol 685 Artechhouse Boston

[23] Wojciech Zaremba Ilya Sutskever and Oriol Vinyals 2014 Recurrent neuralnetwork regularization arXiv preprint arXiv14092329 (2014)

  • Abstract
  • 1 Introduction
  • 2 Related Work
  • 3 Data Collection Framework
    • 31 Traffic Data
    • 32 Weather Data
    • 33 Interest Point Selection for Data Collection
    • 34 Sample Data
    • 35 Traffic Analysis
      • 4 Deep Learning-Based Traffic Prediction Model
        • 41 Recurrent Neural Networks
        • 42 Long Short Term Memory (LSTM)
        • 43 Prediction Model
          • 5 Traffic Flow Prediction
            • 51 Data Representation
            • 52 Data Preprocessing
            • 53 Prediction Results
            • 54 Comparing results with a Multilayer Perceptron model
              • 6 Correlating deviations in traffic prediction with weather conditions
              • 7 Conclusion
              • References
Page 2: Deep Learning Based Urban Analytics Platform: Applications ... · Deep Learning Based Urban Analytics Platform: Applications to Traffic Flow Modeling and Prediction MUD3, Aug 2018,

MUD3, Aug 2018, London, UK · Archit et al.

trans-disciplinary urban research questions using a data-driven discovery and analytics approach.

Summary of Results. We use a stacked LSTM to make accurate short-term traffic predictions. Training on weekday and weekend data separately further improves the prediction results. The trained LSTM prediction model is then used to detect abnormal traffic and correlate it with weather conditions. The results are presented in the following sections.

2 RELATED WORK
Data fusion, as defined by Waltz and Llinas [22], is "a multilevel, multifaceted process dealing with the automatic detection, association, correlation, estimation, and combination of data and information from single and multiple sources to achieve refined position and identity estimates, and complete and timely assessments of situations and threats and their significance." With so many sources of data, multi-sensor data fusion helps us understand the environment and act accordingly. Baloch et al. [7] describe a way to gather more accurate and additional data using more than one sensor; they propose a layered approach for context acquisition and filtering using multiple subtasks.

Leveraging data fusion techniques, El Faouzi and Klein [10] survey the state of the practice in fusing sensor data for traffic applications and report that data fusion techniques indeed provide encouraging results. However, widespread use of data fusion in transportation also faces challenges, such as obtaining data accurate enough to make applications effective and the need to process the data in real time. Bachmann et al. [6] investigate how multi-sensor fusion of loop detectors, Bluetooth-enabled devices, and vehicles equipped with GPS devices improves estimation accuracy, and conclude that accuracy depends on the technology, the number of probe vehicles, and the traffic state. He et al. [12] propose an optimization framework that improves traffic prediction using Twitter semantic data.

A major issue in obtaining traffic flow information in real time is that the majority of road links are not equipped with traffic sensors. Abadi et al. [4] used a dynamic traffic simulator to generate flows on all links using the available traffic information, estimated demand, and historical data; they then used an autoregressive model to predict traffic flows up to 30 minutes ahead using the real-time and estimated traffic data. With huge amounts of traffic data available, the question of which data is relevant arises. Polson et al. [18] showed that recent observations of traffic conditions (i.e., within the last 40 minutes) are stronger predictors than historical values, i.e., measurements from 24 hours ago. Lv et al. [17] proposed using stacked autoencoders to learn latent traffic flow feature representations, such as nonlinear spatial and temporal correlations, from traffic data, and showed the performance of the proposed method to be superior to other machine learning methods such as support vector machines, radial basis functions, and neural networks.

Recently, Long Short-Term Memory (LSTM) networks [13, 19] have been used for various time-series forecasting tasks [5, 14, 16, 20, 21] and have achieved state-of-the-art results. For traffic congestion prediction, Chen et al. [8] proposed an LSTM model with a classification technique to determine congestion and achieved improved accuracies. Our work extends their approach and utilizes LSTMs for making estimated traffic predictions. Moreover, we incorporate weather statistics in assessing traffic conditions. We expect the proposed approach to broaden further by drawing on various surrogate data sources.

3 DATA COLLECTION FRAMEWORK
3.1 Traffic Data
Our traffic data consists of the estimated time of travel from point A to point B by car. There are two simple ways in which we can get the estimated traffic:

3.1.1 Google Maps. Using Google Maps' Distance Matrix API [1], we can get the estimated travel duration and distance between a pair of points. Google's API calls are free only up to a limit of 2,500 calls per day.

3.1.2 Waze. Waze [3] is another navigation app that can provide traffic information. Waze, however, is crowdsourced and does not require an API key. For this project we used web scraping to obtain traffic information from Waze, as represented in Figure 2. Since data collection is done via web scraping, only a limited number of requests can be made per second to the Waze website. Through experimentation, we determined that we can make at most approximately two requests per second from a unique IP. The following equations define the data collection parameters that determine the number of servers needed, based on the number of sources and destinations:

Number of sources: N
Number of destinations: M
Total number of routes: R = N · M
Time to process a single request (seconds): t
Delay between requests (seconds): d
Limit on requests per second: L = 1/(t + d)
Total time to serve all requests (seconds): T = R/L
Data collection interval or period (seconds): P
Buffer time (seconds): B
Number of servers/unique IPs needed: S = ⌊T/(P − B)⌋ + 1
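The per-IP rate limit L = 1/(t + d) amounts to spacing successive requests at least t + d seconds apart. A small pacing helper can enforce this; the sketch below is ours, not from the paper, and takes an injectable clock and sleep function so it can be exercised without real delays:

```python
import time

class RateLimiter:
    """Enforce a minimum delay between successive requests, so the
    effective rate is at most L = 1 / (t + d) requests per second."""

    def __init__(self, delay_s, sleep=time.sleep, clock=time.monotonic):
        self.delay_s = delay_s
        self._sleep = sleep
        self._clock = clock
        self._last = None

    def wait(self):
        """Block until at least delay_s seconds since the last request."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()
```

Calling `wait()` before each GET request keeps a scraper under the observed per-IP limit.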

3.2 Weather Data
Weather data includes information such as temperature, pressure, humidity, and current conditions for the day. OpenWeatherMap [2] is one such service that provides daily weather forecasts and current weather conditions. Using the OpenWeatherMap API, we obtain access to current and past weather conditions in the city of Charlotte.
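Extracting the stored fields from an OpenWeatherMap-style current-weather response can be sketched as follows. The payload below is a hand-written sample in the API's documented JSON shape, not real collected data; the wind speed value is made up:

```python
import json

# Hand-written sample mirroring the first row of Table 3;
# the shape follows OpenWeatherMap's current-weather response.
payload = json.loads("""
{"main": {"temp": 278.64, "temp_min": 277.1, "temp_max": 280.15,
          "pressure": 1008, "humidity": 93},
 "wind": {"speed": 1.5},
 "weather": [{"main": "Mist", "description": "mist"}]}
""")

record = {
    "temp_k": payload["main"]["temp"],
    "temp_min_k": payload["main"]["temp_min"],
    "temp_max_k": payload["main"]["temp_max"],
    "pressure": payload["main"]["pressure"],
    "humidity": payload["main"]["humidity"],
    "wind_speed": payload["wind"]["speed"],
    "weather": payload["weather"][0]["description"],
}
```

Each hourly record flattened this way corresponds to one row of Table 3.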


Figure 2: Web scraping. To obtain traffic information, we periodically make GET requests to the Waze website from different IP locations.

3.3 Interest Point Selection for Data Collection
Since we are constrained by the number of servers available for data collection, we must carefully select our source and destination points. These points should be evenly spread throughout the city so that they represent all of the city's traffic. The city of Charlotte has 42 fire stations that are evenly spread throughout the city, represented in red in Figure 3. It also has 26 light rail stations linearly distributed through the middle of the city, represented by blue dots. Collecting estimated travel durations between fire stations and rail stations periodically throughout the day acts as a proxy that helps us model the inflow and outflow of traffic in the city.

Figure 3: Source and destination points for traffic collection. The city of Charlotte has 42 fire stations spread throughout the city (red pentagons), which act as sources, and 26 rail stations (blue dots), which act as destinations for our traffic data collection.
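Enumerating every (fire station, rail station) pair gives the full set of routes to query each period. A sketch with hypothetical station identifiers (the real data uses station names and numbers):

```python
from itertools import product

# Hypothetical identifiers standing in for the real station names/numbers.
fire_stations = [f"FS-{i}" for i in range(1, 43)]   # 42 sources
rail_stations = [f"RS-{j}" for j in range(1, 27)]   # 26 destinations

# Cartesian product: every source paired with every destination.
routes = list(product(fire_stations, rail_stations))
# len(routes) == 42 * 26 == 1092 routes per collection period
```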

Using the equations listed in the previous section, we estimate the number of servers/unique IPs required to collect data every 5 minutes using web scraping. The chosen parameters are listed in Table 1.

Parameter           Value
N                   42
M                   26
R                   42 × 26 = 1092
t (experimental)    0.5
d (experimental)    0.4
L                   1/(0.5 + 0.4) ≈ 1.1
T                   1092 × 0.9 = 982.8
P                   5 × 60 = 300
B                   60
S                   ⌊982.8/(300 − 60)⌋ + 1 = 5

Table 1: Data collection parameters.

The buffer time B is the time in seconds to wait after making all the requests; it can be useful for performing read/write operations, etc., and can be adjusted accordingly.
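The server estimate in Table 1 follows directly from the formulas in Section 3.1; a short sketch (the function name is ours):

```python
import math

def servers_needed(n_sources, n_dests, t, d, period_s, buffer_s):
    """Number of servers/unique IPs needed to query every route once
    per collection period, given the per-IP request rate limit."""
    routes = n_sources * n_dests          # R = N * M
    rate_limit = 1.0 / (t + d)            # L, requests per second per IP
    total_time = routes / rate_limit      # T = R / L, seconds
    return math.floor(total_time / (period_s - buffer_s)) + 1

# Table 1 values: 42 fire stations, 26 rail stations, t = 0.5 s,
# d = 0.4 s, a 5-minute collection period, and a 60 s buffer.
servers = servers_needed(42, 26, 0.5, 0.4, 300, 60)   # -> 5
```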

3.4 Sample Data
In the experiment, we use traffic data (Table 2) collected from Waze between a given (fire station, rail station) pair over a period of 60 days.

Date     Clock     Fire Station   Rail Station       Time (s)   Distance (m)
3/7/18   0:00:00   40             I-485/South Blvd   2243       29885
3/7/18   0:00:02   40             Sharon Road West   2195       28668
3/7/18   0:00:04   40             Arrowood           2193       28269
3/7/18   0:00:05   40             Archdale           2061       26495
3/7/18   0:00:06   40             Tyvola             1976       25808
3/7/18   0:00:08   40             Woodlawn           2048       26957

Table 2: Traffic data.

At the same time, we also collect hourly weather information using the OpenWeatherMap API. The data (Table 3) contains the current temperature, minimum and maximum temperature within the hour, humidity, wind speed, and a weather description (rain, light rain, thunderstorm, sky is clear).


Date     Time      Temp (K)   Temp Min   Temp Max   Pressure   Humidity   Weather
3/7/18   0:00:00   278.64     277.1      280.15     1008       93         mist
3/7/18   1:00:00   278.14     276.15     279.15     1007       93         mist
3/7/18   2:00:00   278.06     276.15     279.15     1007       93         mist
3/7/18   3:00:00   278.31     277.15     279.15     1006       100        mist
3/7/18   4:00:00   278.65     277.15     280.15     1006       100        mist
3/7/18   5:00:00   278.73     277.15     280.15     1006       100        mist

Table 3: Weather data.

3.5 Traffic Analysis
To make a sensible prediction, we first analyze the traffic patterns on a selected route throughout the week.

Figure 4: Average traffic from Monday to Sunday. The x-axis represents the time of day and the y-axis represents the estimated time of travel in seconds on this route.

As seen in Figure 4, the traffic flow patterns on weekdays are similar. There is a spike in traffic from 7:00 AM to 8:00 AM, which can be attributed to people driving to work. The same spike is missing on Saturdays and Sundays because they are work holidays. There is also noticeably less traffic late on Sunday night, i.e., in the early hours of Monday.

Figure 5: Weekday pattern. Overlapping traffic patterns from Monday to Friday reveal the weekday traffic pattern.

Traffic patterns from Monday to Friday, when plotted together (Figure 5), reveal the general trend of weekday traffic.

Figure 6: Weekend pattern.

Saturdays and Sundays have similar traffic patterns, except that the intensity of traffic is lower on Sundays (Figure 6).
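The weekday/weekend split used in this analysis amounts to averaging the per-interval series by day of week. A sketch on hypothetical stand-in data (60 days of 288 five-minute intervals, with day 0 assumed to be a Monday; the real data comes from the Waze collection above):

```python
import numpy as np

# Hypothetical stand-in for the collected series: 60 days x 288
# five-minute intervals of estimated travel time in seconds.
rng = np.random.default_rng(0)
travel = rng.normal(2000.0, 100.0, size=(60, 288))

day_of_week = np.arange(60) % 7            # 0 = Monday ... 6 = Sunday
weekday_profile = travel[day_of_week < 5].mean(axis=0)   # Mon-Fri average
weekend_profile = travel[day_of_week >= 5].mean(axis=0)  # Sat-Sun average
```

Plotting `weekday_profile` against `weekend_profile` reproduces the kind of comparison shown in Figures 5 and 6.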

4 DEEP LEARNING-BASED TRAFFIC PREDICTION MODEL
In this section we discuss Long Short-Term Memory (LSTM) networks, a particular type of Recurrent Neural Network (RNN), and how they can be used to model traffic flow for prediction.

4.1 Recurrent Neural Networks
Recurrent neural networks (RNNs) [23] are similar to feedforward neural networks, except that they also have connections pointing backwards. A simple RNN is composed of one neuron receiving inputs, producing an output, and sending that output back to itself. The output of this single recurrent neuron is

y(t) = φ(x(t)^T · wx + y(t−1)^T · wy + b)

At each time step t, the neuron receives the input x(t) as well as its own output from the previous time step, y(t−1). Similarly, for an RNN layer, at each time step t every neuron in the layer receives both the input vector x(t) and the output vector from the previous time step, y(t−1).

Although RNNs have the potential to learn contextual dependencies, and their gradients are easy to compute, the range of input sequence they can handle is limited in practice due to the vanishing gradient problem. Various efforts have been made to tackle this problem since the 1990s; among them, the LSTM model, a variant of the standard RNN, was proposed by Hochreiter and Schmidhuber [13].
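The recurrent update above is a few lines of NumPy. The sketch below is our illustration with toy dimensions, unrolling y(t) = φ(x(t)·wx + y(t−1)·wy + b) over a short sequence:

```python
import numpy as np

def rnn_step(x_t, y_prev, wx, wy, b, phi=np.tanh):
    """One step of a simple recurrent layer:
    y(t) = phi(x(t) @ wx + y(t-1) @ wy + b)."""
    return phi(x_t @ wx + y_prev @ wy + b)

# Toy dimensions: 3 inputs, 4 recurrent neurons.
rng = np.random.default_rng(1)
wx = rng.normal(size=(3, 4))
wy = rng.normal(size=(4, 4))
b = np.zeros(4)

y = np.zeros(4)                            # y(0)
for x_t in rng.normal(size=(5, 3)):        # unroll over 5 time steps
    y = rnn_step(x_t, y, wx, wy, b)
```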


4.2 Long Short-Term Memory (LSTM)

Figure 7: LSTM cell.

Figure 7 illustrates the LSTM [13, 19] architecture. This cell splits its state into two vectors, h(t) and c(t), where h(t) is the short-term state and c(t) is the long-term state. As the long-term state c(t−1) traverses the network, it first goes through a forget gate, dropping some memories; it then adds some new memories via the addition operation, which adds the memories selected by the input gate. The result, c(t), is the output of the cell without any further transformation. After the addition operation, the long-term state is also copied and passed through a tanh function to produce h(t), which is equal to the cell's output for this time step, y(t). The main layer is the one that outputs g(t); it analyzes the current input x(t) and the previous short-term state h(t−1). The three other layers, f(t), i(t), and o(t), are gate controllers; they use the logistic activation function, so their outputs range from 0 to 1. Their outputs are fed to element-wise multiplication operations: if they output 0s they close the gate, and if they output 1s they open it. Hence an LSTM can learn to recognize an important input, store it in the long-term state, preserve it for as long as it is needed, and extract it whenever it is needed. The following equations summarize the computation in the LSTM cell:

i(t) = σ(Wxi^T · x(t) + Whi^T · h(t−1) + bi)

f(t) = σ(Wxf^T · x(t) + Whf^T · h(t−1) + bf)

o(t) = σ(Wxo^T · x(t) + Who^T · h(t−1) + bo)

g(t) = tanh(Wxg^T · x(t) + Whg^T · h(t−1) + bg)

c(t) = f(t) ⊗ c(t−1) + i(t) ⊗ g(t)

y(t) = h(t) = o(t) ⊗ tanh(c(t))

where Whi, Whf, Who, and Whg are the weight matrices of each of the four layers for their connections to the previous short-term state h(t−1), while Wxi, Wxf, Wxo, and Wxg are the weight matrices of each of the four layers for their connections to the input x(t). The terms bi, bf, bo, and bg are the bias terms for each of the four layers.
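The cell equations above translate directly into NumPy. This is our illustrative sketch with toy dimensions, not the paper's implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step following the equations above.
    W[name] = (Wx, Wh) weight pair and b[name] = bias for each layer."""
    i = sigmoid(x @ W["i"][0] + h_prev @ W["i"][1] + b["i"])   # input gate
    f = sigmoid(x @ W["f"][0] + h_prev @ W["f"][1] + b["f"])   # forget gate
    o = sigmoid(x @ W["o"][0] + h_prev @ W["o"][1] + b["o"])   # output gate
    g = np.tanh(x @ W["g"][0] + h_prev @ W["g"][1] + b["g"])   # main layer
    c = f * c_prev + i * g              # new long-term state
    h = o * np.tanh(c)                  # new short-term state = output y(t)
    return h, c

# Toy sizes: input dimension 1 (travel time), hidden dimension 8.
rng = np.random.default_rng(2)
W = {k: (rng.normal(size=(1, 8)), rng.normal(size=(8, 8))) for k in "ifog"}
b = {k: np.zeros(8) for k in "ifog"}

h, c = np.zeros(8), np.zeros(8)
for x in rng.normal(size=(12, 1)):      # one 12-step input sequence
    h, c = lstm_step(x, h, c, W, b)
```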

4.3 Prediction Model
While the original LSTM model comprises a single hidden LSTM layer followed by a standard feedforward output layer, a stacked LSTM has multiple hidden LSTM layers, each containing multiple memory cells. Studies [9, 11, 15] have shown that deep LSTM architectures with several hidden layers can build up progressively higher-level representations of sequence data and thus work more effectively. These architectures stack several LSTM hidden layers, where the output of one LSTM hidden layer is fed as input to the subsequent LSTM hidden layer. We experimented with different numbers of stacked LSTM layers and empirically determined that a stacked LSTM with 4 layers works best for our data (Figure 8).

The code base and the data for this study will be made available in our GitHub repository (https://github.com/ArchitParnami/UAP) for reproducibility and further research.

5 TRAFFIC FLOW PREDICTION
The following subsections describe the traffic flow prediction model for a single route, i.e., the shortest route from Fire Station 20 to JW Clay Rail Station. This route connects the city of Charlotte from one end to the other. The model can be generalized to predict traffic on any route. Furthermore, we assume that the traffic data from Waze, i.e., the estimated time of travel between two points obtained from Waze, is a true representation of actual traffic.

5.1 Data Representation
Since we collect data every 5 minutes for a route, we have 288 time intervals in a 24-hour day. For a day i, the data x_i is

x_i = [t1, t2, ..., t288]

The training set has 60 days of data. Therefore, the entire training data can be represented as

X_train = [x1, x2, ..., x60]

We try to predict the traffic for the 61st day (i.e., the subsequent day). Therefore, the test set is the traffic for the next day:

X_test = [x61]

5.2 Data Preprocessing
X_train is flattened into a single list with the number of observations equal to 288 × 60 = 17280, so

X_train = [t1, t2, ..., t17280]

The training data is then normalized into the range [0, 1]:

X_train = Normalize(X_train)

Similarly, the test data is also normalized:

X_test = Normalize(X_test)

An LSTM model expects input in the form of a sequence. The length of the sequence, also known as the number of time steps, is


Figure 8: Architecture of the traffic prediction model. The first layer represents the input layer, which contains the traffic data from the current and previous time intervals. This data is fed into a stacked LSTM model of four layers, connected to a dense layer that outputs the estimated traffic for the next time interval.

a hyperparameter that can be set. For this problem, the sequence length is set to 12. That means we train the LSTM model on the past hour of traffic data and try to predict the traffic in the next 5-minute interval. After this transformation, the training observations look like this:

x1 = [t1, t2, ..., t12], y1 = [t13]

x2 = [t13, t14, ..., t24], y2 = [t25]

Finally, the training set has the form

X_train = [x1, x2, ..., xn],   Y_train = [y1, y2, ..., yn]

The test data is also split into sequences of length 12. To predict the first time interval of the 61st day, we also add traffic information from the previous day, i.e., the last sequence of the 60th day, to our test data.
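The windowing just described (sequence length 12, next-interval target) can be sketched as follows; the function names are ours:

```python
import numpy as np

def make_sequences(series, steps=12, stride=12):
    """Split a 1-D series into (X, y) pairs: 'steps' past values -> next
    value. stride=12 reproduces the non-overlapping windows listed above;
    stride=1 would give a fully sliding window instead."""
    X, y = [], []
    for start in range(0, len(series) - steps, stride):
        X.append(series[start:start + steps])
        y.append(series[start + steps])
    return np.array(X), np.array(y)

def normalize(series):
    """Min-max scale a series into [0, 1]."""
    lo, hi = series.min(), series.max()
    return (series - lo) / (hi - lo)

# Toy series t1..t25: yields exactly the two windows listed above,
# x1 = [t1..t12] -> t13 and x2 = [t13..t24] -> t25.
X, y = make_sequences(normalize(np.arange(25.0)))
```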

5.3 Prediction Results
The red line in Figure 9 represents the real traffic, whereas the blue line represents the traffic predictions made by our LSTM model depicted in Figure 8. The y-axis is the ETA in seconds and the x-axis is the time of day.

While the stacked LSTM architecture fits well for making traffic predictions, we observed that having a separate stacked LSTM model just for weekdays could improve the prediction results. This is based on our previous observation that the weekday traffic


Figure 9: Real traffic (red) vs. predicted traffic (blue) when trained on all the historical data (mean squared error = 5.17).

Figure 10: Real traffic vs. predicted traffic when trained only on weekdays (mean squared error = 4.61).

pattern differs from the weekend traffic pattern, so having a separate model would be beneficial (Figure 10).

Date       Weekday Avg MSE   Weekday Avg MAPE   All-day Avg MSE   All-day Avg MAPE
05/15/18   4.59              4.66               4.73              4.73
05/16/18   3.41              4.51               3.49              4.6
05/17/18   3.97              4.77               4.0               4.86
05/18/18   3.79              4.72               3.84              4.87
Average    3.94              4.66               4.01              4.76

Table 4: Mean squared error and mean absolute percentage error averaged over 10 training and test runs for the weekday and all-day prediction models.

From Table 4 we can conclude that the weekday training model performed better than the all-days training model. Figure 11 displays the average residue of the weekday model vs. the all-day model for a sample route. The green area represents the intervals where the average prediction error was lower when using the weekday model, whereas the red area captures the intervals where the average prediction error was higher when using the weekday model. The figure reaffirms the efficacy of the weekday training model.

Figure 11: Training models' prediction error. Green regions indicate lower prediction errors when the model was trained only on weekdays; red regions indicate higher prediction errors.
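The two error metrics reported here can be computed as follows (a standard implementation, not the authors' code):

```python
import numpy as np

def mse(actual, predicted):
    """Mean squared error."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean((a - p) ** 2))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((a - p) / a)) * 100.0)
```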

5.4 Comparing Results with a Multilayer Perceptron Model
The following results were obtained when we trained an artificial neural network with 3 hidden layers (288, 16, 8). The inputs to the model were the day of the week (1-7) and the time interval (1-288), and the output was the traffic (estimated time of travel).

Figure 12: Real traffic vs. predicted traffic on 05/15/18, obtained using a multilayer perceptron model trained on all days.

Date       MSE    MAPE
05/15/18   7.9    5.5
05/16/18   3.7    5.46
05/17/18   4.92   6.22
05/18/18   7.35   6.19
Average    5.97   5.84

Table 5: Mean squared error and mean absolute percentage errors obtained using the multilayer perceptron model.


It can be observed that while the multilayer perceptron model is able to learn the general traffic pattern, the LSTM model captures the traffic pattern more accurately, with lower MSE and MAPE.

6 CORRELATING DEVIATIONS IN TRAFFIC PREDICTION WITH WEATHER CONDITIONS
Traffic conditions are dynamic in nature. There are many uncontrollable factors that can cause deviations in traffic patterns, for example a sudden thunderstorm, an accident, or a game event. Therefore, an LSTM model cannot make accurate predictions all the time, considering that there are other factors in play that can cause sudden changes in traffic conditions. Here we focus on one such factor, weather, and try to correlate it with and observe its influence on traffic conditions.

Figures 13 and 14 show the changing weather conditions at different time intervals. For example, it rained around 1:00-2:00 AM, and then there was a thunderstorm followed by rain at 12:00 PM. The figures also display the real traffic and the LSTM-predicted traffic. In certain regions, such as during the thunderstorm, the deviation (the prediction error) was observed to be larger than normal, i.e., than when the sky was clear.

Figure 13: Peak deviations in traffic correlated with weather conditions. The arrows point at a traffic condition at a particular time and weather.

Figure 14: Deviation in traffic prediction with weather. The orange region highlights the interval of a thunderstorm and the dark green region highlights the interval of heavy rain. These two regions correspond to the highest deviations in traffic prediction.

Intuitively, and experimentally from Figure 14, it can be said that traffic conditions are definitely affected by weather conditions. This impact of anomalous weather on traffic can be measured using LSTM predictions, as observed in this study.
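One simple way to turn these observations into an automatic check is to flag intervals whose prediction error is unusually large. The threshold rule below is our sketch, not the paper's method:

```python
import numpy as np

def flag_deviations(actual, predicted, k=3.0):
    """Flag intervals whose absolute prediction error exceeds the mean
    error by more than k standard deviations (threshold choice is ours)."""
    resid = np.abs(np.asarray(actual, float) - np.asarray(predicted, float))
    threshold = resid.mean() + k * resid.std()
    return resid > threshold

# Synthetic check: a flat day with one thunderstorm-like miss.
actual = np.full(288, 2000.0)
predicted = actual.copy()
predicted[150] -= 600.0        # large prediction miss at one interval
flags = flag_deviations(actual, predicted)
```

Flagged intervals can then be looked up against the weather records, as done manually for Figures 13 and 14.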

7 CONCLUSION
In this paper we have discussed a low-cost data acquisition and fusion model for traffic flow analysis. To get high-quality traffic data, researchers [6, 10] typically must rely on government sources for data from hardware devices (i.e., loop detectors, GPS). In view of the large amount of data required for this project, paid services like Google Maps are often not an option. Thus, acquiring high-quality traffic data is one of the challenges in traffic prediction. However, by utilizing web scraping on a crowdsourced website, i.e., Waze, we could gather estimated traffic information in the city. The OpenWeatherMap API allows us to retrieve relevant information regarding possible weather conditions behind unexpected traffic flows.

Our LSTM model has shown good performance for short-term prediction; however, it lacks the capability to make long-term predictions. Utilizing the data fusion engine, we aim to build a more sophisticated model that can not only make long-term traffic predictions but can also inform us about anomalous traffic behavior using diverse data sources such as social networks, news headlines, and accident logs. This will require the development of an efficient anomaly detection technique that can automate the informing process.

REFERENCES
[1] [n. d.]. Google Maps Distance Matrix API. https://developers.google.com/maps/documentation/distance-matrix/intro. Accessed 2018-05-29.
[2] [n. d.]. OpenWeatherMap. https://openweathermap.org. Accessed 2018-05-29.
[3] [n. d.]. Waze. https://www.waze.com. Accessed 2018-05-29.
[4] Afshin Abadi, Tooraj Rajabioun, and Petros A. Ioannou. 2015. Traffic flow prediction for road transportation networks with limited traffic data. IEEE Transactions on Intelligent Transportation Systems 16, 2 (2015), 653-662.
[5] Michael Auli, Michel Galley, Chris Quirk, and Geoffrey Zweig. 2013. Joint language and translation modeling with recurrent neural networks. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP).
[6] Chris Bachmann, Baher Abdulhai, Matthew J. Roorda, and Behzad Moshiri. 2013. A comparative assessment of multi-sensor data fusion techniques for freeway traffic speed estimation using microsimulation modeling. Transportation Research Part C: Emerging Technologies 26 (2013), 33-48.
[7] Zartasha Baloch, Faisal Karim Shaikh, and Mukhtiar Ali Unar. 2018. A context-aware data fusion approach for health-IoT. International Journal of Information Technology 10, 3 (2018), 241-245.
[8] Yuan-yuan Chen, Yisheng Lv, Zhenjiang Li, and Fei-Yue Wang. 2016. Long short-term memory model for traffic congestion prediction with online open data. In 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC). IEEE, 132-137.
[9] Zhiyong Cui, Ruimin Ke, and Yinhai Wang. 2018. Deep bidirectional and unidirectional LSTM recurrent neural network for network-wide traffic speed prediction. arXiv preprint arXiv:1801.02143 (2018).
[10] Nour-Eddin El Faouzi and Lawrence A. Klein. 2016. Data fusion for ITS: techniques and research needs. Transportation Research Procedia 15 (2016), 495-512.
[11] Yoav Goldberg. 2016. A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research 57 (2016), 345-420.
[12] Jingrui He, Wei Shen, Phani Divakaruni, Laura Wynter, and Rick Lawrence. 2013. Improving traffic prediction with tweet semantics. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). 1387-1393.
[13] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735-1780.
[14] Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3128-3137.
[15] Xiangang Li and Xihong Wu. 2015. Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 4520-4524.
[16] Zachary Chase Lipton, David C. Kale, Charles Elkan, and Randall C. Wetzel. 2015. Learning to diagnose with LSTM recurrent neural networks. CoRR abs/1511.03677 (2015). arXiv:1511.03677 http://arxiv.org/abs/1511.03677
[17] Yisheng Lv, Yanjie Duan, Wenwen Kang, Zhengxi Li, and Fei-Yue Wang. 2015. Traffic flow prediction with big data: a deep learning approach. IEEE Transactions on Intelligent Transportation Systems 16, 2 (2015), 865-873.
[18] Nicholas G. Polson and Vadim O. Sokolov. 2017. Deep learning for short-term traffic flow prediction. Transportation Research Part C: Emerging Technologies 79 (2017), 1-17.
[19] Haşim Sak, Andrew Senior, and Françoise Beaufays. 2014. Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition. arXiv preprint arXiv:1402.1128 (2014).
[20] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems. 3104-3112.
[21] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. 2015. Show and tell: A neural image caption generator. In Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 3156-3164.
[22] Edward Waltz, James Llinas, et al. 1990. Multisensor Data Fusion. Vol. 685. Artech House, Boston.
[23] Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329 (2014).

  • Abstract
  • 1 Introduction
  • 2 Related Work
  • 3 Data Collection Framework
    • 31 Traffic Data
    • 32 Weather Data
    • 33 Interest Point Selection for Data Collection
    • 34 Sample Data
    • 35 Traffic Analysis
      • 4 Deep Learning-Based Traffic Prediction Model
        • 41 Recurrent Neural Networks
        • 42 Long Short Term Memory (LSTM)
        • 43 Prediction Model
          • 5 Traffic Flow Prediction
            • 51 Data Representation
            • 52 Data Preprocessing
            • 53 Prediction Results
            • 54 Comparing results with a Multilayer Perceptron model
              • 6 Correlating deviations in traffic prediction with weather conditions
              • 7 Conclusion
              • References
Page 3: Deep Learning Based Urban Analytics Platform: Applications ... · Deep Learning Based Urban Analytics Platform: Applications to Traffic Flow Modeling and Prediction MUD3, Aug 2018,

Deep Learning Based Urban Analytics PlatformApplications to Traffic Flow Modeling and Prediction MUD3 Aug 2018 London UK

Figure 2 Web scraping To obtain traffic information we pe-riodically make GET requests to Waze website from different IPlocations

33 Interest Point Selection for Data CollectionSince we are constrained by the number of servers we can have fordata collection we need to carefully select our source and destina-tion points for traffic data collection The source and destinationpoints should be such that they are evenly spread out through-out the city so that they represent all the traffic in the city Thecity of Charlotte has 42 Fire stations which are evenly spread outthroughout the city These are represented by red in Figure 3 Italso has 26 Light Rail Stations linearly distributed in the middle ofthe city These are represented by blue dots Collecting estimatedtravel durations between Fire Stations and Rail Stations periodi-cally throughout the day will act as a proxy in help us modelingthe inflow and outflow of traffic in the city

Figure 3: Source and destination points for traffic collection. The city of Charlotte has 42 fire stations spread throughout the city (represented by red pentagons), which act as sources, and 26 rail stations (represented by blue dots), which act as destinations for our traffic data collection.

Using the equations listed in the previous section, we estimate the number of servers/unique IPs required to collect data every 5 minutes using web scraping. The chosen parameters are listed in Table 1.

Parameter            Value
N                    42
M                    26
R                    42 × 26 = 1092
t (experimental)     0.5
d (experimental)     0.4
L                    109
T                    1092 × 0.9 = 982.8
P                    5 × 60 = 300
B                    60
S                    ⌊982.8 / (300 − 60)⌋ + 1 = 5

Table 1: Data collection parameters.

The buffer time B is the time in seconds to wait after making all the requests; this can be useful for performing any read-write operations, etc., and can be adjusted accordingly.
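The server-count arithmetic in Table 1 can be reproduced in a few lines of Python. This is a sketch using the paper's notation; interpreting t and d as per-request processing time and delay is our reading of the table, not stated explicitly here.

```python
import math

# Parameters from Table 1 (symbols follow the paper's notation)
N = 42            # fire stations (sources)
M = 26            # rail stations (destinations)
R = N * M         # route requests per collection cycle = 1092
t = 0.5           # per-request time in seconds (experimental)
d = 0.4           # per-request delay in seconds (experimental)
T = R * (t + d)   # total time to issue all requests = 982.8 s
P = 5 * 60        # collection period: every 5 minutes = 300 s
B = 60            # buffer time in seconds for read/write operations

# Number of servers/unique IPs needed so all requests fit within one
# collection period minus the buffer
S = math.floor(T / (P - B)) + 1
print(S)  # -> 5
```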

3.4 Sample Data

In the experiment we are using traffic data (Table 2) collected from Waze API calls between given (fire station, rail station) pairs over a period of 60 days.

Date     Clock     Fire Station   Rail Station        Time (s)   Distance (m)
3/7/18   0:00:00   40             I-485/South Blvd    2243       29885
3/7/18   0:00:02   40             Sharon Road West    2195       28668
3/7/18   0:00:04   40             Arrowood            2193       28269
3/7/18   0:00:05   40             Archdale            2061       26495
3/7/18   0:00:06   40             Tyvola              1976       25808
3/7/18   0:00:08   40             Woodlawn            2048       26957

Table 2: Traffic data.

At the same time, we also collect hourly weather information using the OpenWeatherMap API. The data (Table 3) contains the current temperature, minimum and maximum temperature within the hour, humidity, wind speed, and a weather description (e.g., rain, light rain, thunderstorm, sky is clear).


Date     Time      Temp (K)   Temp Min   Temp Max   Pressure   Humidity   Weather
3/7/18   0:00:00   278.64     277.1      280.15     1008       93         mist
3/7/18   1:00:00   278.14     276.15     279.15     1007       93         mist
3/7/18   2:00:00   278.06     276.15     279.15     1007       93         mist
3/7/18   3:00:00   278.31     277.15     279.15     1006       100        mist
3/7/18   4:00:00   278.65     277.15     280.15     1006       100        mist
3/7/18   5:00:00   278.73     277.15     280.15     1006       100        mist

Table 3: Weather data.

3.5 Traffic Analysis

In order to make a sensible prediction, we first analyze the traffic patterns on a selected route throughout the week.

Figure 4: Average traffic from Monday to Sunday. The x-axis represents the time of the day and the y-axis represents the estimated time of travel in seconds on this route.

As seen in Figure 4, the traffic flow patterns on weekdays are similar. There is a spike in traffic from 7:00 AM to 8:00 AM, which can be attributed to people driving to work. The same spike is missing on Saturdays and Sundays, as they are not workdays. There is also noticeably less traffic on a late Sunday night, i.e., the early hours of Monday.

Figure 5: Weekday pattern. Overlapping traffic patterns from Monday to Friday reveals the weekday traffic pattern.

Traffic patterns from Monday to Friday, when plotted together (Figure 5), reveal the general trend of traffic on weekdays.

Figure 6: Weekend pattern.

Saturdays and Sundays have similar traffic patterns, except that the intensity of traffic is lower on Sundays (Figure 6).

4 DEEP LEARNING-BASED TRAFFIC PREDICTION MODEL

In this section we discuss Long Short Term Memory (LSTM) networks, a particular type of Recurrent Neural Network (RNN), and how they can be used to model traffic flow for prediction.

4.1 Recurrent Neural Networks

Recurrent neural networks (RNNs) [23] are similar to feedforward neural networks, except that they also have connections pointing backwards. A simple RNN is composed of one neuron receiving inputs, producing an output, and sending that output back to itself. The output of this single-neuron RNN is

    y(t) = φ(x(t)^T · w_x + y(t−1)^T · w_y + b)

At each time step t, this neuron receives the input x(t) as well as its own output from the previous time step, y(t−1). Similarly, for an RNN layer, at each time step t every neuron in the layer receives both the input vector x(t) and the output vector from the previous time step, y(t−1).

Although an RNN has the potential to learn contextual dependencies and its gradients are easy to compute, the range of input sequences it can handle is limited in practice due to the vanishing gradient problem. Various efforts have been made to tackle this problem since the 1990s. Among them, the LSTM model, a variant of the standard RNN, was proposed by Hochreiter and Schmidhuber [13].


4.2 Long Short Term Memory (LSTM)

Figure 7: LSTM cell.

Figure 7 illustrates an LSTM [13, 19] architecture. This cell has its state split into two vectors, h(t) and c(t), where h(t) is the short-term state and c(t) the long-term state. As the long-term state c(t−1) traverses the cell, it first goes through a forget gate, dropping some memories, and then adds some new memories via the addition operation, which adds the memories selected by the input gate. The result, c(t), is output without any further transformation. After the addition operation, the long-term state is also copied and passed through a tanh function, and the result is filtered by the output gate to produce the short-term state h(t), which is equal to the cell's output for this time step, y(t). The main layer is the one that outputs g(t); it analyzes the current input x(t) and the previous short-term state h(t−1). The three other layers, f(t), i(t), and o(t), are gate controllers; they use the logistic activation function, so their outputs range from 0 to 1. Their outputs are fed to element-wise multiplication operations, so if they output 0s they close the gate, and if they output 1s they open it. Hence an LSTM can learn to recognize an important input, store it in the long-term state, preserve it for as long as it is needed, and extract it whenever it is needed. The equations below summarize how computation in the LSTM cell takes place:

    i(t) = σ(Wxi^T · x(t) + Whi^T · h(t−1) + b_i)
    f(t) = σ(Wxf^T · x(t) + Whf^T · h(t−1) + b_f)
    o(t) = σ(Wxo^T · x(t) + Who^T · h(t−1) + b_o)
    g(t) = tanh(Wxg^T · x(t) + Whg^T · h(t−1) + b_g)
    c(t) = f(t) ⊗ c(t−1) + i(t) ⊗ g(t)
    y(t) = h(t) = o(t) ⊗ tanh(c(t))
where Whi, Whf, Who, and Whg are the weight matrices of each of the four layers for their connections to the previous short-term state h(t−1), while Wxi, Wxf, Wxo, and Wxg are the weight matrices of each of the four layers for their connections to the input x(t). The terms b_i, b_f, b_o, and b_g are the biases for each of the four layers.
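As an illustration, the six equations above can be transcribed directly into a NumPy sketch of a single LSTM cell step. The weights here are randomly initialized for shape-checking only; this is not the trained model from the paper.

```python
import numpy as np

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell step, transcribing the six equations above.
    W maps each gate name to a (W_x, W_h) pair; b maps it to a bias."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    i = sigmoid(W["i"][0].T @ x_t + W["i"][1].T @ h_prev + b["i"])  # input gate
    f = sigmoid(W["f"][0].T @ x_t + W["f"][1].T @ h_prev + b["f"])  # forget gate
    o = sigmoid(W["o"][0].T @ x_t + W["o"][1].T @ h_prev + b["o"])  # output gate
    g = np.tanh(W["g"][0].T @ x_t + W["g"][1].T @ h_prev + b["g"])  # main layer
    c_t = f * c_prev + i * g            # new long-term state
    h_t = o * np.tanh(c_t)              # new short-term state = output y(t)
    return h_t, c_t

# Shape check with 3 inputs and 4 hidden units, randomly initialized
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = {k: (rng.normal(size=(n_in, n_hid)), rng.normal(size=(n_hid, n_hid)))
     for k in "ifog"}
b = {k: np.zeros(n_hid) for k in "ifog"}
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
print(h.shape, c.shape)  # (4,) (4,)
```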

4.3 Prediction Model

While the original LSTM model comprises a single hidden LSTM layer followed by a standard feedforward output layer, a stacked LSTM has multiple hidden LSTM layers, each containing multiple memory cells. Studies [9, 11, 15] have shown that deep LSTM architectures with several hidden layers can build up progressively higher-level representations of sequence data and thus work more effectively. In these architectures, the output of one LSTM hidden layer is fed as the input to the subsequent LSTM hidden layer. We experimented with different numbers of stacked LSTM layers and empirically determined that a stacked LSTM with 4 layers works best for our data (Figure 8).
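The 4-layer stacked LSTM of Figure 8 could be sketched as follows. The paper does not state the framework or per-layer widths; Keras, 50 units per layer, and the Adam/MSE training setup are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(timesteps=12, units=50):
    # Four stacked LSTM layers followed by a dense output layer, as in
    # Figure 8; units per layer is an assumed hyperparameter.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, 1)),      # past hour of 5-min ETAs
        layers.LSTM(units, return_sequences=True),
        layers.LSTM(units, return_sequences=True),
        layers.LSTM(units, return_sequences=True),
        layers.LSTM(units),                        # last layer emits only h_T
        layers.Dense(1),                           # next-interval ETA estimate
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```

Setting `return_sequences=True` on all but the last LSTM layer ensures each layer passes its full output sequence to the next, which is what "stacked" requires.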

The code base and the data for this study will be made available on our GitHub repository (https://github.com/ArchitParnami/UAP) for reproducibility and further research.

5 TRAFFIC FLOW PREDICTION

The following subsections describe the traffic flow prediction model for a single route, i.e., the shortest route from Fire Station 20 to JW Clay Rail Station. This route connects the city of Charlotte from one end to the other. The model can be generalized to predict traffic on any route. Furthermore, we assume that the traffic data from Waze, i.e., the estimated time of travel between two points obtained from Waze, is a true representation of the actual traffic.

5.1 Data Representation

Since we collect data every 5 minutes for a route, we have 288 time intervals in a 24-hour day. For a day i, the data x_i is

    x_i = [t_1, t_2, …, t_288]

The training set has 60 days of data. Therefore, the entire training data can be represented as

    X_train = [x_1, x_2, …, x_60]

We try to predict the traffic for the 61st day (i.e., the subsequent day). Therefore, the test set is the traffic for the next day:

    X_test = [x_61]

5.2 Data Preprocessing

X_train is flattened into a single list with the number of observations equal to 288 × 60 = 17280, so

    X_train = [t_1, t_2, …, t_17280]

The training data is then normalized into the range [0, 1]:

    X_train = Normalize(X_train)

Similarly, the test data is also normalized:

    X_test = Normalize(X_test)

An LSTM model expects input in the form of a sequence. The length of the sequence, also known as the number of time steps, is


Figure 8: Architecture of the traffic prediction model. The first layer represents the input layer, which contains the traffic data from the current and previous time intervals. This data is fed into a stacked LSTM model of four layers, which is connected to a dense layer that outputs the estimated traffic for the next time interval.

a hyperparameter that can be set. For this problem, the length of the sequence is set to 12. That means we continuously train the LSTM model on the past hour of traffic data and try to predict the traffic in the next 5-minute interval. After this transformation, the training observations look like this:

    x_1 = [t_1, t_2, …, t_12],     y_1 = [t_13]
    x_2 = [t_13, t_14, …, t_24],   y_2 = [t_25]
    …

Finally, the training set will have the form

    X_train = [x_1, x_2, …, x_n],    Y_train = [y_1, y_2, …, y_n]

The test data is also split into sequences of length 12. To predict the first time interval of the 61st day, we also add traffic information from the previous day, i.e., the last sequence of the 60th day, to our test data.
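The preprocessing steps above (min-max normalization, then splitting into length-12 windows each paired with the next interval as its target) can be sketched as follows; the function names are ours, and the windows step by 12 as in the x_1/x_2 example above.

```python
import numpy as np

def normalize(x):
    # Min-max scaling into [0, 1], as in Section 5.2
    return (x - x.min()) / (x.max() - x.min())

def make_sequences(series, seq_len=12):
    """Split a flat series into length-seq_len windows, each paired with
    the immediately following value as its target, stepping by seq_len
    (x1 = [t1..t12], y1 = t13; x2 = [t13..t24], y2 = t25; ...)."""
    X, Y = [], []
    for start in range(0, len(series) - seq_len, seq_len):
        X.append(series[start:start + seq_len])
        Y.append(series[start + seq_len])
    return np.array(X), np.array(Y)

# 288 intervals/day x 60 days = 17280 observations (dummy values here)
train = normalize(np.arange(17280, dtype=float))
X_train, Y_train = make_sequences(train)
print(X_train.shape, Y_train.shape)  # (1439, 12) (1439,)
```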

5.3 Prediction Results

The red line in Figure 9 represents the real traffic, whereas the blue line represents the traffic predictions made by our LSTM model depicted in Figure 8. The y-axis is the ETA in seconds and the x-axis is the time of the day.

While the stacked LSTM architecture fits very well for making traffic predictions, we observed that having a separate stacked LSTM model just for weekdays could improve the prediction results. This is based on our previous observations that the weekday traffic


Figure 9: Real traffic (red) vs. predicted traffic (blue) when trained on all the historic data (Mean Squared Error = 5.17).

Figure 10: Real traffic vs. predicted traffic when trained only on weekdays (Mean Squared Error = 4.61).

pattern differs from the weekend traffic pattern, so having a separate model would be beneficial (Figure 10).

Date       Weekday Avg MSE   Weekday Avg MAPE   All-day Avg MSE   All-day Avg MAPE
05/15/18   4.59              4.66               4.73              4.73
05/16/18   3.41              4.51               3.49              4.60
05/17/18   3.97              4.77               4.00              4.86
05/18/18   3.79              4.72               3.84              4.87
Average    3.94              4.66               4.01              4.76

Table 4: Mean squared error and mean absolute percentage error averaged over 10 training and test runs for the weekday and all-day prediction models.

From Table 4 we can conclude that the weekday training model performed better than the all-days training model. Figure 11 displays the average residual for the weekday model vs. the all-day model for a sample route. The green area represents the intervals where the average prediction error was lower when using the weekday model, whereas the red area captures the intervals where it was higher. The figure reaffirms the efficacy of the weekday training model.

Figure 11: Training models' prediction error. Green regions indicate lower prediction errors when the model was trained only on weekdays, and red regions indicate higher prediction errors.

5.4 Comparing results with a Multilayer Perceptron model

The following results were obtained when we trained an artificial neural network with 3 hidden layers (288168). The inputs to the model were the day of the week (1-7) and the time interval (1-288), and the output was the traffic (estimated time of travel).

Figure 12: Real traffic vs. predicted traffic on 05/15/18, obtained using a multilayer perceptron model trained on all days.

Date       MSE    MAPE
05/15/18   7.90   5.50
05/16/18   3.70   5.46
05/17/18   4.92   6.22
05/18/18   7.35   6.19
Average    5.97   5.84

Table 5: Mean squared error and mean absolute percentage errors obtained using the multilayer perceptron model.


It can be observed that while the multilayer perceptron model is able to learn the general traffic pattern, the LSTM model captures the traffic pattern more accurately, with lower MSE and MAPE.
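For reference, the two error metrics reported in Tables 4 and 5 can be computed as follows; this is a generic sketch, not the authors' evaluation code.

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    # Mean absolute percentage error, in percent
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))

print(mse([100, 200], [110, 190]))   # 100.0
print(mape([100, 200], [110, 190]))  # 7.5
```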

6 CORRELATING DEVIATIONS IN TRAFFIC PREDICTION WITH WEATHER CONDITIONS

Traffic conditions are dynamic in nature. There are many uncontrollable factors that can cause deviations in traffic patterns, for example a sudden thunderstorm, an accident, or a game event. Therefore, an LSTM model cannot make accurate predictions all the time, considering that there are other factors in play that can cause sudden changes in traffic conditions. Here we focus on one such factor, i.e., weather, and try to correlate it with and observe its influence on traffic conditions.

Figure 13 and Figure 14 show the changing weather conditions at different time intervals. For example, it rained around 1:00-2:00 AM, and then there was a thunderstorm followed by rain at 12:00 PM. The figures also display the real traffic and the LSTM-predicted traffic. In certain regions, such as when there was a thunderstorm, the deviation, or error in prediction, was observed to be larger than normal, i.e., larger than when the sky was clear.

Figure 13: Peak deviations in traffic correlated with weather conditions. The arrows point at a traffic condition at a particular time and weather.

Figure 14: Deviation in traffic prediction with weather. The orange region highlights the interval of a thunderstorm and the dark green region highlights the interval of heavy rain. These two regions correspond to the highest deviations in traffic prediction.

Intuitively, and experimentally from Figure 14, it can be said that traffic conditions are affected by weather conditions. This impact of anomalous weather on traffic can be measured using LSTM predictions, as observed in this study.
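One simple way to quantify this relationship, assuming the prediction residuals and hourly weather labels have been joined into one table (the column names and values below are purely illustrative, not the paper's data), is to group the absolute prediction error by weather condition:

```python
import pandas as pd

# Hypothetical joined table of per-interval absolute prediction error
# and the hourly weather label it falls under (values are illustrative)
df = pd.DataFrame({
    "abs_error": [12.0, 15.0, 80.0, 95.0, 10.0, 70.0],
    "weather":   ["sky is clear", "sky is clear", "thunderstorm",
                  "thunderstorm", "mist", "heavy rain"],
})

# Mean absolute deviation per weather condition; the largest values flag
# the conditions under which the LSTM deviates most from observed traffic
by_weather = df.groupby("weather")["abs_error"].mean().sort_values(ascending=False)
print(by_weather.index[0])  # condition with the largest average deviation
```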

7 CONCLUSION

In this paper we have discussed a low-cost data acquisition and fusion model for traffic flow analysis. To get high-quality traffic data, researchers [6, 10] must rely on government sources for data from hardware devices (i.e., loop detectors, GPS). In view of the large amount of data required for this project, paid services like Google Maps are often not an option. Thus, acquiring high-quality traffic data is one of the challenges in traffic prediction. However, by utilizing the power of web scraping on a crowdsourced website, i.e., Waze, we could gather estimated traffic information for the city. The OpenWeatherMap API allows us to retrieve relevant information regarding possible weather conditions for unexpected traffic flows.

Our LSTM model has shown good performance for short-term prediction; however, it lacks the capability for making long-term predictions. Utilizing the data fusion engine, we aim to build a more sophisticated model that can not only make long-term traffic predictions but can also inform us about anomalous traffic behavior using diverse data sources such as social networks, news headlines, and accident logs. This will require the development of an efficient anomaly detection technique that can automate the informing process.

REFERENCES
[1] [n.d.]. Google Maps Distance Matrix API. https://developers.google.com/maps/documentation/distance-matrix/intro. Accessed 2018-05-29.
[2] [n.d.]. OpenWeatherMap. https://openweathermap.org. Accessed 2018-05-29.
[3] [n.d.]. Waze. https://www.waze.com. Accessed 2018-05-29.
[4] Afshin Abadi, Tooraj Rajabioun, and Petros A. Ioannou. 2015. Traffic flow prediction for road transportation networks with limited traffic data. IEEE Transactions on Intelligent Transportation Systems 16, 2 (2015), 653–662.
[5] Michael Auli, Michel Galley, Chris Quirk, and Geoffrey Zweig. 2013. Joint language and translation modeling with recurrent neural networks. 3 (2013).
[6] Chris Bachmann, Baher Abdulhai, Matthew J. Roorda, and Behzad Moshiri. 2013. A comparative assessment of multi-sensor data fusion techniques for freeway traffic speed estimation using microsimulation modeling. Transportation Research Part C: Emerging Technologies 26 (2013), 33–48.
[7] Zartasha Baloch, Faisal Karim Shaikh, and Mukhtiar Ali Unar. 2018. A context-aware data fusion approach for health-IoT. International Journal of Information Technology 10, 3 (2018), 241–245.
[8] Yuan-yuan Chen, Yisheng Lv, Zhenjiang Li, and Fei-Yue Wang. 2016. Long short-term memory model for traffic congestion prediction with online open data. In 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC). IEEE, 132–137.
[9] Zhiyong Cui, Ruimin Ke, and Yinhai Wang. 2018. Deep Bidirectional and Unidirectional LSTM Recurrent Neural Network for Network-wide Traffic Speed Prediction. arXiv preprint arXiv:1801.02143 (2018).
[10] Nour-Eddin El Faouzi and Lawrence A. Klein. 2016. Data fusion for ITS: techniques and research needs. Transportation Research Procedia 15 (2016), 495–512.
[11] Yoav Goldberg. 2016. A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research 57 (2016), 345–420.
[12] Jingrui He, Wei Shen, Phani Divakaruni, Laura Wynter, and Rick Lawrence. 2013. Improving Traffic Prediction with Tweet Semantics. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). 1387–1393.
[13] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
[14] Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3128–3137.
[15] Xiangang Li and Xihong Wu. 2015. Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 4520–4524.
[16] Zachary Chase Lipton, David C. Kale, Charles Elkan, and Randall C. Wetzel. 2015. Learning to Diagnose with LSTM Recurrent Neural Networks. CoRR abs/1511.03677 (2015). arXiv:1511.03677 http://arxiv.org/abs/1511.03677
[17] Yisheng Lv, Yanjie Duan, Wenwen Kang, Zhengxi Li, and Fei-Yue Wang. 2015. Traffic flow prediction with big data: a deep learning approach. IEEE Transactions on Intelligent Transportation Systems 16, 2 (2015), 865–873.
[18] Nicholas G. Polson and Vadim O. Sokolov. 2017. Deep learning for short-term traffic flow prediction. Transportation Research Part C: Emerging Technologies 79 (2017), 1–17.
[19] Haşim Sak, Andrew Senior, and Françoise Beaufays. 2014. Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition. arXiv preprint arXiv:1402.1128 (2014).
[20] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems. 3104–3112.
[21] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. 2015. Show and tell: A neural image caption generator. In Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 3156–3164.
[22] Edward Waltz, James Llinas, et al. 1990. Multisensor Data Fusion. Vol. 685. Artech House, Boston.
[23] Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329 (2014).


tion for road transportation networks with limited traffic data IEEE transactionson intelligent transportation systems 16 2 (2015) 653ndash662

[5] Michael Auli Michel Galley Chris Quirk and Geoffrey Zweig 2013 Joint lan-guage and translation modeling with recurrent neural networks 3 (2013)

[6] Chris Bachmann Baher Abdulhai Matthew J Roorda and Behzad Moshiri 2013A comparative assessment of multi-sensor data fusion techniques for freeway

Deep Learning Based Urban Analytics PlatformApplications to Traffic Flow Modeling and Prediction MUD3 Aug 2018 London UK

traffic speed estimation using microsimulation modeling Transportation researchpart C emerging technologies 26 (2013) 33ndash48

[7] Zartasha Baloch Faisal Karim Shaikh and Mukhtiar Ali Unar 2018 A context-aware data fusion approach for health-IoT International Journal of InformationTechnology 10 3 (2018) 241ndash245

[8] Yuan-yuan Chen Yisheng Lv Zhenjiang Li and Fei-Yue Wang 2016 Long short-term memory model for traffic congestion prediction with online open dataIn 2016 IEEE 19th International Conference on Intelligent Transportation Systems(ITSC) IEEE 132ndash137

[9] Zhiyong Cui Ruimin Ke and Yinhai Wang 2018 Deep Bidirectional and Uni-directional LSTM Recurrent Neural Network for Network-wide Traffic SpeedPrediction arXiv preprint arXiv180102143 (2018)

[10] Nour-Eddin El Faouzi and Lawrence AKlein 2016 Data fusion for ITS techniquesand research needs Transportation Research Procedia 15 (2016) 495ndash512

[11] Yoav Goldberg 2016 A primer on neural network models for natural languageprocessing Journal of Artificial Intelligence Research 57 (2016) 345ndash420

[12] Jingrui He Wei Shen Phani Divakaruni Laura Wynter and Rick Lawrence2013 Improving Traffic Prediction with Tweet Semantics In Proceedings ofInternational Joint Conference on Artificial Intelligence (IJCAI) 1387ndash1393

[13] Sepp Hochreiter and Juumlrgen Schmidhuber 1997 Long short-termmemory Neuralcomputation 9 8 (1997) 1735ndash1780

[14] Andrej Karpathy and Li Fei-Fei 2015 Deep visual-semantic alignments forgenerating image descriptions In Proceedings of the IEEE conference on computervision and pattern recognition 3128ndash3137

[15] Xiangang Li and Xihong Wu 2015 Constructing long short-term memory baseddeep recurrent neural networks for large vocabulary speech recognition InAcous-tics Speech and Signal Processing (ICASSP) 2015 IEEE International Conference onIEEE 4520ndash4524

[16] Zachary Chase Lipton David C Kale Charles Elkan and Randall C Wetzel2015 Learning to Diagnose with LSTM Recurrent Neural Networks CoRRabs151103677 (2015) arXiv151103677 httparxivorgabs151103677

[17] Yisheng Lv Yanjie Duan Wenwen Kang Zhengxi Li and Fei-Yue Wang 2015Traffic flow prediction with big data a deep learning approach IEEE Transactionson Intelligent Transportation Systems 16 2 (2015) 865ndash873

[18] Nicholas G Polson and Vadim O Sokolov 2017 Deep learning for short-termtraffic flow prediction Transportation Research Part C Emerging Technologies 79(2017) 1ndash17

[19] Haşim Sak Andrew Senior and Franccediloise Beaufays 2014 Long short-term mem-ory based recurrent neural network architectures for large vocabulary speechrecognition arXiv preprint arXiv14021128 (2014)

[20] Ilya Sutskever Oriol Vinyals and Quoc V Le 2014 Sequence to sequence learningwith neural networks In Advances in neural information processing systems 3104ndash3112

[21] Oriol Vinyals Alexander Toshev Samy Bengio and Dumitru Erhan 2015 Showand tell A neural image caption generator In Computer Vision and Pattern Recog-nition (CVPR) 2015 IEEE Conference on IEEE 3156ndash3164

[22] Edward Waltz James Llinas et al 1990 Multisensor data fusion Vol 685 Artechhouse Boston

[23] Wojciech Zaremba Ilya Sutskever and Oriol Vinyals 2014 Recurrent neuralnetwork regularization arXiv preprint arXiv14092329 (2014)

  • Abstract
  • 1 Introduction
  • 2 Related Work
  • 3 Data Collection Framework
    • 31 Traffic Data
    • 32 Weather Data
    • 33 Interest Point Selection for Data Collection
    • 34 Sample Data
    • 35 Traffic Analysis
      • 4 Deep Learning-Based Traffic Prediction Model
        • 41 Recurrent Neural Networks
        • 42 Long Short Term Memory (LSTM)
        • 43 Prediction Model
          • 5 Traffic Flow Prediction
            • 51 Data Representation
            • 52 Data Preprocessing
            • 53 Prediction Results
            • 54 Comparing results with a Multilayer Perceptron model
              • 6 Correlating deviations in traffic prediction with weather conditions
              • 7 Conclusion
              • References
Page 5: Deep Learning Based Urban Analytics Platform: Applications ... · Deep Learning Based Urban Analytics Platform: Applications to Traffic Flow Modeling and Prediction MUD3, Aug 2018,

Deep Learning Based Urban Analytics Platform: Applications to Traffic Flow Modeling and Prediction. MUD3, Aug 2018, London, UK

4.2 Long Short Term Memory (LSTM)

Figure 7: LSTM Cell

Figure 7 illustrates an LSTM [13, 19] architecture. The cell's state is split into two vectors, h(t) and c(t), where h(t) is the short-term state and c(t) is the long-term state. As the long-term state c(t-1) traverses the cell, it first goes through a forget gate, dropping some memories, and then adds some new memories via the addition operation, which adds the memories selected by the input gate. The result, c(t), is passed on without any further transformation. After the addition operation, the long-term state is also copied, passed through a tanh function, and filtered by the output gate; this produces the short-term state h(t), which is equal to the cell's output for this time step, y(t). The main layer is the one that outputs g(t): it analyzes the current input x(t) and the previous short-term state h(t-1). The three other layers, f(t), i(t), and o(t), are gate controllers; they use the logistic activation function, so their outputs range from 0 to 1. Their outputs are fed to the element-wise multiplication operations: if they output 0s, they close the gate, and if they output 1s, they open it. Hence an LSTM can learn to recognize an important input, store it in the long-term state, preserve it for as long as it is needed, and extract it whenever it is needed. The following equations summarize the computation in the LSTM cell:

i(t) = σ(Wxiᵀ · x(t) + Whiᵀ · h(t-1) + bi)
f(t) = σ(Wxfᵀ · x(t) + Whfᵀ · h(t-1) + bf)
o(t) = σ(Wxoᵀ · x(t) + Whoᵀ · h(t-1) + bo)
g(t) = tanh(Wxgᵀ · x(t) + Whgᵀ · h(t-1) + bg)
c(t) = f(t) ⊗ c(t-1) + i(t) ⊗ g(t)
y(t) = h(t) = o(t) ⊗ tanh(c(t))

where Whi, Whf, Who, and Whg are the weight matrices of each of the four layers for their connections to the previous short-term state h(t-1), while Wxi, Wxf, Wxo, and Wxg are the weight matrices of each of the four layers for their connections to the input vector x(t). The terms bi, bf, bo, and bg are the bias vectors for each of the four layers.
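As a minimal sketch, the equations above can be implemented directly in NumPy. The function name `lstm_step`, the weight initialization, and the toy dimensions below are our own illustrative assumptions, not taken from the paper's code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following the equations above.

    W maps each weight name (e.g. "xi", "hi") to a matrix; b maps each
    gate name to a bias vector.
    """
    i = sigmoid(W["xi"].T @ x_t + W["hi"].T @ h_prev + b["i"])  # input gate
    f = sigmoid(W["xf"].T @ x_t + W["hf"].T @ h_prev + b["f"])  # forget gate
    o = sigmoid(W["xo"].T @ x_t + W["ho"].T @ h_prev + b["o"])  # output gate
    g = np.tanh(W["xg"].T @ x_t + W["hg"].T @ h_prev + b["g"])  # candidate memories
    c = f * c_prev + i * g           # new long-term state
    h = o * np.tanh(c)               # new short-term state = cell output y(t)
    return h, c

# Toy dimensions: input size 3, state size 4, small random weights.
rng = np.random.default_rng(0)
n_in, n_h = 3, 4
W = {k: rng.normal(scale=0.1, size=(n_in if k[0] == "x" else n_h, n_h))
     for k in ["xi", "hi", "xf", "hf", "xo", "ho", "xg", "hg"]}
b = {k: np.zeros(n_h) for k in ["i", "f", "o", "g"]}

h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_h), np.zeros(n_h), W, b)
print(h.shape, c.shape)  # (4,) (4,)
```

Note that because h(t) is the product of a logistic gate and a tanh of the cell state, each component of h(t) always lies strictly inside (-1, 1).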

4.3 Prediction Model

While the original LSTM model is comprised of a single hidden LSTM layer followed by a standard feedforward output layer, a stacked LSTM has multiple hidden LSTM layers, where each layer contains multiple memory cells. Studies [9, 11, 15] have shown that deep LSTM architectures with several hidden layers can build up progressively higher-level representations of sequence data and thus work more effectively. Deep LSTM architectures are networks with several stacked LSTM hidden layers, in which the output of one LSTM hidden layer is fed as the input to the subsequent LSTM hidden layer. We experimented with different numbers of LSTM layers stacked onto each other and empirically determined that a stacked LSTM with 4 layers works best for our data (Figure 8).

The code base and the data for this study will be made available on our GitHub repository (https://github.com/ArchitParnami/UAP) for reproducibility and further research.
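The stacking described above can be sketched as follows: each LSTM layer consumes the full hidden-state sequence produced by the layer below it, and a dense layer maps the last hidden state to the predicted traffic value. The hidden size of 8 and the random placeholder weights are illustrative assumptions; the paper's trained model uses 4 stacked layers but does not report layer sizes here:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_layer(xs, n_h, rng):
    """Run one LSTM layer over a sequence xs of shape (T, n_in).

    Returns the hidden-state sequence of shape (T, n_h), which a stacked
    model feeds into the next LSTM layer. Weights are random placeholders.
    """
    n_in = xs.shape[1]
    Wx = rng.normal(scale=0.1, size=(4, n_in, n_h))  # gates i, f, o and layer g
    Wh = rng.normal(scale=0.1, size=(4, n_h, n_h))
    b = np.zeros((4, n_h))
    h, c, hs = np.zeros(n_h), np.zeros(n_h), []
    for x_t in xs:
        z = [Wx[k].T @ x_t + Wh[k].T @ h + b[k] for k in range(4)]
        i, f, o = sigmoid(z[0]), sigmoid(z[1]), sigmoid(z[2])
        g = np.tanh(z[3])
        c = f * c + i * g
        h = o * np.tanh(c)
        hs.append(h)
    return np.stack(hs)

rng = np.random.default_rng(1)
seq = rng.normal(size=(12, 1))         # past hour: 12 five-minute intervals
out = seq
for _ in range(4):                     # 4 stacked LSTM layers
    out = lstm_layer(out, n_h=8, rng=rng)
w_dense = rng.normal(scale=0.1, size=8)
prediction = float(w_dense @ out[-1])  # dense layer on the last hidden state
print(out.shape)  # (12, 8)
```

In a deep-learning framework the same structure corresponds to chaining LSTM layers that return full sequences, with only the final layer's last state feeding the dense output.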

5 TRAFFIC FLOW PREDICTION

The following subsections describe the traffic flow prediction model for a single route, i.e., the shortest route from Fire Station 20 to JW Clay Rail Station. This route connects the city of Charlotte from one end to another. The model can be generalized to predict traffic on any route. Furthermore, we assume that the traffic data from Waze, i.e., the estimated time of travel between two points obtained from Waze, is a true representation of the actual traffic.

5.1 Data Representation

Since we collect data every 5 minutes for a route, we have 288 time intervals in a 24-hour day. For a day i, the data xi is

xi = [t1, t2, …, t288]

The training set has 60 days of data. Therefore, the entire training data can be represented as

X_train = [x1, x2, …, x60]

We try to predict the traffic for the 61st day (i.e., the subsequent day). Therefore, the test set is the traffic for the next day:

X_test = [x61]

5.2 Data Preprocessing

X_train is flattened into a single list with the number of observations equal to 288 × 60 = 17280, so

X_train = [t1, t2, …, t17280]

The training data is then normalized into the range [0, 1]:

X_train = Normalize(X_train)

Similarly, the test data is also normalized:

X_test = Normalize(X_test)

Figure 8: Architecture of the traffic prediction model. The first layer in this architecture represents the input layer, which contains the traffic data from the current and previous time intervals. This data is fed into a stacked LSTM model of four layers, which is connected to a dense layer that outputs the estimated traffic for the next time interval.

An LSTM model expects input in the form of a sequence. The length of the sequence, also known as the number of time steps, is a hyperparameter that can be set. For this problem, the length of the sequence is set to 12. That means we continuously train the LSTM model on the past hour of traffic data and try to predict the traffic in the next 5-minute interval. After this transformation, the training observations look like this:

x1 = [t1, t2, …, t12], y1 = [t13]

x2 = [t13, t14, …, t24], y2 = [t25]

Finally, the training set will have the form

X_train = [x1, x2, …, xn],  Y_train = [y1, y2, …, yn]

The test data is also split into sequences of length 12. To predict the 1st time interval of the 61st day, we also add traffic information from the previous day, i.e., the last sequence of the 60th day, to our test data.
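The preprocessing steps above can be sketched as follows. The helper names `normalize` and `make_sequences` are ours; the stride defaults to 12 because the paper's worked example uses non-overlapping windows (x2 starts at t13), and the traffic values are synthetic placeholders:

```python
import numpy as np

def normalize(x):
    """Min-max normalize into [0, 1], as done for X_train and X_test."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def make_sequences(series, n_steps=12, stride=12):
    """Split a flattened traffic series into (past hour, next interval) pairs."""
    X, y = [], []
    for i in range(0, len(series) - n_steps, stride):
        X.append(series[i:i + n_steps])   # 12 past 5-minute intervals
        y.append(series[i + n_steps])     # the interval to predict
    return np.array(X), np.array(y)

# 60 days x 288 intervals of (synthetic) travel times, flattened.
train = normalize(np.random.default_rng(2).uniform(300, 900, size=60 * 288))
X_train, y_train = make_sequences(train)
print(X_train.shape, y_train.shape)  # (1439, 12) (1439,)
```

With stride 1 instead, every 12-interval window would be used, yielding 17268 overlapping training pairs from the 17280 observations.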

5.3 Prediction Results

The red line in Figure 9 represents the real traffic, whereas the blue line represents the traffic predictions made by our LSTM model depicted in Figure 8. The y-axis is the ETA in seconds and the x-axis is the time of day.

Figure 9: Real traffic (red) vs. predicted traffic (blue) when trained on all the historic data (Mean Squared Error = 51.7)

Figure 10: Real traffic vs. predicted traffic when trained only on weekdays (Mean Squared Error = 46.1)

While the stacked LSTM architecture fits very well for making traffic predictions, we observed that having a separate stacked LSTM model just for weekdays could improve the prediction results. This is based on our previous observation that the weekday traffic pattern differs from the weekend traffic pattern, so having a separate model would be beneficial (Figure 10).

Date     | Weekday Average MSE | Weekday Average MAPE | All-day Average MSE | All-day Average MAPE
05/15/18 | 45.9                | 4.66                 | 47.3                | 4.73
05/16/18 | 34.1                | 4.51                 | 34.9                | 4.6
05/17/18 | 39.7                | 4.77                 | 40.0                | 4.86
05/18/18 | 37.9                | 4.72                 | 38.4                | 4.87
Average  | 39.4                | 4.66                 | 40.1                | 4.76

Table 4: Mean squared error and mean absolute percentage error averaged over 10 training and test runs for the weekday and all-day prediction models.

From Table 4 we can conclude that the weekday training model performed better than the all-days training model. Figure 11 displays the average residual error for the weekday model vs. the all-day model for a sample route. The green area represents the intervals where the average prediction error was lower when using the weekday model, whereas the red area captures the intervals where the average prediction error was higher when using the weekday model. The figure reaffirms the efficacy of the weekday training model.

Figure 11: Training models' prediction error. Green regions indicate lower prediction errors when the model was trained only on weekdays, and red regions indicate higher prediction errors.

5.4 Comparing results with a Multilayer Perceptron model

The following results were obtained when we trained an artificial neural network with 3 hidden layers (288168). The inputs to the model were the day of the week (1-7) and the time interval (1-288), and the output was the traffic (estimated time of travel).

Figure 12: Real traffic vs. predicted traffic on 05/15/18, obtained using a multilayer perceptron model trained on all days.

Date     | MSE  | MAPE
05/15/18 | 79   | 5.5
05/16/18 | 37   | 5.46
05/17/18 | 49.2 | 6.22
05/18/18 | 73.5 | 6.19
Average  | 59.7 | 5.84

Table 5: Mean squared error and mean absolute percentage errors obtained using the multilayer perceptron model.


It can be observed that while the multilayer perceptron model is able to learn the general traffic pattern, the LSTM model captures the traffic pattern more accurately, with lower MSE and MAPE.
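For reference, the two error metrics reported in Tables 4 and 5 can be computed as follows. The traffic values below are toy data for illustration, not the paper's measurements:

```python
import numpy as np

def mse(real, pred):
    """Mean squared error between real and predicted traffic."""
    real, pred = np.asarray(real, float), np.asarray(pred, float)
    return float(np.mean((real - pred) ** 2))

def mape(real, pred):
    """Mean absolute percentage error, in percent."""
    real, pred = np.asarray(real, float), np.asarray(pred, float)
    return float(np.mean(np.abs((real - pred) / real)) * 100)

# Toy ETA values in seconds for four 5-minute intervals.
real = [600, 610, 650, 700]
pred = [590, 615, 640, 720]
print(mse(real, pred), mape(real, pred))
```

MAPE is scale-free, which makes it easier to compare routes with different typical travel times, while MSE penalizes the large deviations that matter most during congestion.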

6 CORRELATING DEVIATIONS IN TRAFFIC PREDICTION WITH WEATHER CONDITIONS

Traffic conditions are dynamic in nature. There are many uncontrollable factors that can cause deviations in traffic patterns, for example, a sudden thunderstorm, an accident, or a game event. Therefore, an LSTM model cannot make accurate predictions all the time, considering that there are other factors at play that could cause a sudden change in traffic conditions. Here we focus on one such factor, i.e., weather, and try to correlate and observe its influence on traffic conditions.

Figure 13 and Figure 14 show the changing weather conditions at different time intervals. For example, it rained around 1:00-2:00 AM, and then there was a thunderstorm followed by rain at 12:00 PM. The figures also display the real traffic and the LSTM-predicted traffic. In certain regions, such as when there was a thunderstorm, the deviation (the error in prediction) was observed to be greater than normal, i.e., greater than when the sky was clear.

Figure 13: Peak deviations in traffic correlated with weather conditions. The arrows point at a traffic condition at a particular time and weather.

Figure 14: Deviation in traffic prediction with weather. The orange region highlights the interval of a thunderstorm and the dark green region highlights the interval of heavy rain. These two regions correspond to the highest deviations in traffic prediction.

Intuitively, and experimentally from Figure 14, it can be said that traffic conditions are clearly affected by weather conditions. The impact of anomalous weather on traffic can be measured using LSTM predictions, as observed in this study.
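One simple way to quantify this correlation is to group the absolute prediction residuals by the weather condition observed in each interval. This is a sketch of the idea only; the function name, the weather labels, and the toy ETA values below are our illustrative assumptions, not the paper's data:

```python
import numpy as np
from collections import defaultdict

def residuals_by_weather(real, pred, weather):
    """Average absolute prediction error per weather condition."""
    groups = defaultdict(list)
    for r, p, w in zip(real, pred, weather):
        groups[w].append(abs(r - p))
    return {w: float(np.mean(errs)) for w, errs in groups.items()}

# Toy ETA values (seconds) per 5-minute interval with weather labels.
real    = [600, 605, 640, 900, 950, 615]
pred    = [598, 607, 636, 800, 840, 612]
weather = ["clear", "clear", "rain", "thunderstorm", "thunderstorm", "clear"]

avg_err = residuals_by_weather(real, pred, weather)
print(avg_err)  # larger average residual under thunderstorm than clear skies
```

Intervals whose residual far exceeds the clear-sky baseline are then candidates for weather-driven anomalies, which is the pattern highlighted in Figures 13 and 14.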

7 CONCLUSION

In this paper we have discussed a low-cost data acquisition and fusion model for traffic flow analysis. To get high-quality traffic data, researchers [6, 10] must rely on government sources for data from hardware devices (e.g., loop detectors, GPS). In view of the large amount of data that was required for this project, paid services like Google Maps are often not an option. Thus, acquiring high-quality traffic data is one of the challenges in traffic prediction. However, by utilizing the power of web scraping on a crowdsourced website, i.e., Waze, we could gather estimated traffic information in the city. The OpenWeatherMap API allows us to retrieve relevant information regarding possible weather conditions for unexpected traffic flows.

Our LSTM model has shown good performance for short-term prediction; however, it lacks the capability for making long-term predictions. Utilizing the data fusion engine, we aim to build a more sophisticated model that can not only make long-term traffic predictions but also inform us about anomalous traffic behavior using diverse data sources such as social networks, news headlines, and accident logs. This will require the development of an efficient anomaly detection technique that can automate the informing process.

REFERENCES

[1] [n. d.]. Google Maps Distance Matrix API. https://developers.google.com/maps/documentation/distance-matrix/intro. Accessed 2018-05-29.
[2] [n. d.]. OpenWeatherMap. https://openweathermap.org. Accessed 2018-05-29.
[3] [n. d.]. Waze. https://www.waze.com. Accessed 2018-05-29.
[4] Afshin Abadi, Tooraj Rajabioun, and Petros A. Ioannou. 2015. Traffic flow prediction for road transportation networks with limited traffic data. IEEE Transactions on Intelligent Transportation Systems 16, 2 (2015), 653–662.
[5] Michael Auli, Michel Galley, Chris Quirk, and Geoffrey Zweig. 2013. Joint language and translation modeling with recurrent neural networks. 3 (2013).
[6] Chris Bachmann, Baher Abdulhai, Matthew J. Roorda, and Behzad Moshiri. 2013. A comparative assessment of multi-sensor data fusion techniques for freeway traffic speed estimation using microsimulation modeling. Transportation Research Part C: Emerging Technologies 26 (2013), 33–48.
[7] Zartasha Baloch, Faisal Karim Shaikh, and Mukhtiar Ali Unar. 2018. A context-aware data fusion approach for health-IoT. International Journal of Information Technology 10, 3 (2018), 241–245.
[8] Yuan-yuan Chen, Yisheng Lv, Zhenjiang Li, and Fei-Yue Wang. 2016. Long short-term memory model for traffic congestion prediction with online open data. In 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC). IEEE, 132–137.
[9] Zhiyong Cui, Ruimin Ke, and Yinhai Wang. 2018. Deep Bidirectional and Unidirectional LSTM Recurrent Neural Network for Network-wide Traffic Speed Prediction. arXiv preprint arXiv:1801.02143 (2018).
[10] Nour-Eddin El Faouzi and Lawrence A. Klein. 2016. Data fusion for ITS: techniques and research needs. Transportation Research Procedia 15 (2016), 495–512.
[11] Yoav Goldberg. 2016. A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research 57 (2016), 345–420.
[12] Jingrui He, Wei Shen, Phani Divakaruni, Laura Wynter, and Rick Lawrence. 2013. Improving Traffic Prediction with Tweet Semantics. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). 1387–1393.
[13] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735–1780.
[14] Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3128–3137.
[15] Xiangang Li and Xihong Wu. 2015. Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 4520–4524.
[16] Zachary Chase Lipton, David C. Kale, Charles Elkan, and Randall C. Wetzel. 2015. Learning to Diagnose with LSTM Recurrent Neural Networks. CoRR abs/1511.03677 (2015). arXiv:1511.03677 http://arxiv.org/abs/1511.03677
[17] Yisheng Lv, Yanjie Duan, Wenwen Kang, Zhengxi Li, and Fei-Yue Wang. 2015. Traffic flow prediction with big data: a deep learning approach. IEEE Transactions on Intelligent Transportation Systems 16, 2 (2015), 865–873.
[18] Nicholas G. Polson and Vadim O. Sokolov. 2017. Deep learning for short-term traffic flow prediction. Transportation Research Part C: Emerging Technologies 79 (2017), 1–17.
[19] Haşim Sak, Andrew Senior, and Françoise Beaufays. 2014. Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition. arXiv preprint arXiv:1402.1128 (2014).
[20] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems. 3104–3112.
[21] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. 2015. Show and tell: A neural image caption generator. In Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 3156–3164.
[22] Edward Waltz, James Llinas, et al. 1990. Multisensor data fusion. Vol. 685. Artech House, Boston.
[23] Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329 (2014).

  • 2 Related Work
  • 3 Data Collection Framework
    • 31 Traffic Data
    • 32 Weather Data
    • 33 Interest Point Selection for Data Collection
    • 34 Sample Data
    • 35 Traffic Analysis
      • 4 Deep Learning-Based Traffic Prediction Model
        • 41 Recurrent Neural Networks
        • 42 Long Short Term Memory (LSTM)
        • 43 Prediction Model
          • 5 Traffic Flow Prediction
            • 51 Data Representation
            • 52 Data Preprocessing
            • 53 Prediction Results
            • 54 Comparing results with a Multilayer Perceptron model
              • 6 Correlating deviations in traffic prediction with weather conditions
              • 7 Conclusion
              • References
Page 7: Deep Learning Based Urban Analytics Platform: Applications ... · Deep Learning Based Urban Analytics Platform: Applications to Traffic Flow Modeling and Prediction MUD3, Aug 2018,

Deep Learning Based Urban Analytics Platform: Applications to Traffic Flow Modeling and Prediction. MUD3, Aug 2018, London, UK.

Figure 9: Real traffic (red) vs. predicted traffic (blue) when trained on all the historic data (Mean Squared Error = 5.17).

Figure 10: Real traffic vs. predicted traffic when trained only on weekdays (Mean Squared Error = 4.61).

pattern differs from the weekend traffic pattern, so having a separate model would be beneficial (Figure 10).
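The weekday/weekend split motivating the separate models can be sketched as follows (a minimal illustration; the DataFrame layout and column names are assumptions, not the platform's actual schema):

```python
import pandas as pd

# Illustrative traffic log: one estimated travel time (minutes) per
# 5-minute interval for a full week starting on a Monday.
df = pd.DataFrame({
    "timestamp": pd.date_range("2018-05-14", periods=7 * 288, freq="5min"),
    "travel_time": 10.0,
})

# Monday=0 ... Sunday=6; keep Monday-Friday for the weekday-only model.
is_weekday = df["timestamp"].dt.dayofweek < 5
weekday_df = df[is_weekday]
weekend_df = df[~is_weekday]
```

Each subset would then be fed to its own LSTM, so the weekday model never sees the weekend pattern.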

Date       Weekday Avg MSE   Weekday Avg MAPE   All-day Avg MSE   All-day Avg MAPE
05/15/18   4.59              4.66               4.73              4.73
05/16/18   3.41              4.51               3.49              4.6
05/17/18   3.97              4.77               4.0               4.86
05/18/18   3.79              4.72               3.84              4.87
Average    3.94              4.66               4.01              4.76

Table 4: Mean squared error and mean absolute percentage error averaged over 10 training and test runs for the weekday and all-day prediction models.
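For reference, the two error metrics reported above can be computed as follows (a straightforward NumPy sketch; `y_true` holds observed travel times and `y_pred` the model's predictions):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean((y_true - y_pred) ** 2)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes y_true != 0)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# e.g. mse([10, 12], [11, 11]) -> 1.0 ; mape([10, 20], [11, 19]) -> 7.5
```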

From Table 4 we can conclude that the weekday training model performed better than the all-day training model. Figure 11 displays the average residual for the weekday model vs. the all-day model for a sample route. The green area represents the intervals where the average prediction error was lower when using the weekday model, whereas the red area captures the intervals where it was higher. The figure reaffirms the efficacy of the weekday training model.

Figure 11: Training models' prediction error. Green regions indicate lower prediction errors when the model was trained only on weekdays, and red regions indicate higher prediction errors.

5.4 Comparing results with a Multilayer Perceptron model

The following results were obtained when we trained an artificial neural network with 3 hidden layers (288168). The inputs to the model were the day of week (1-7) and the time interval (1-288), and the output was traffic (estimated time of travel).
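A minimal sketch of such a baseline, using scikit-learn's MLPRegressor on synthetic data (the hidden layer sizes and the toy travel-time signal here are illustrative assumptions, not the paper's exact configuration):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n = 2000
# Features as in the paper: day of week (1-7) and 5-minute interval (1-288).
day = rng.integers(1, 8, size=n)
interval = rng.integers(1, 289, size=n)
# Synthetic travel time (minutes) with a daily periodic component.
travel_time = 10 + 5 * np.sin(2 * np.pi * interval / 288) + 0.3 * day
X = np.column_stack([day, interval]).astype(float)

# Hidden layer sizes are illustrative only.
mlp = MLPRegressor(hidden_layer_sizes=(64, 32, 16), max_iter=500,
                   random_state=0)
mlp.fit(X, travel_time)
pred = mlp.predict(X)
```

Unlike the LSTM, this model sees each (day, interval) pair independently, with no memory of the preceding intervals, which is one reason it tracks only the general daily shape.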

Figure 12: Real traffic vs. predicted traffic on 05/15/18, obtained using a multilayer perceptron model trained on all days.

Date       MSE    MAPE
05/15/18   7.9    5.5
05/16/18   3.7    5.46
05/17/18   4.92   6.22
05/18/18   7.35   6.19
Average    5.97   5.84

Table 5: Mean squared error and mean absolute percentage errors obtained using the multilayer perceptron model.


It can be observed that while the multilayer perceptron model is able to learn the general traffic pattern, the LSTM model captures it more accurately, with lower MSE and MAPE.

6 CORRELATING DEVIATIONS IN TRAFFIC PREDICTION WITH WEATHER CONDITIONS

Traffic conditions are dynamic in nature. There are many uncontrollable factors which can cause deviations in traffic patterns, for example a sudden thunderstorm, an accident, or a game event. Therefore an LSTM model cannot be expected to make accurate predictions at all times, given that other factors in play can cause sudden changes in traffic conditions. Here we focus on one such factor, i.e., weather, and examine its influence on traffic conditions.

Figure 13 and Figure 14 show the changing weather conditions at different time intervals. For example, it rained around 1:00-2:00 AM, and there was a thunderstorm followed by rain at 12:00 PM. The figures also display the real traffic and the LSTM-predicted traffic. In certain regions, such as during the thunderstorm, the deviation (error) in prediction was observed to be larger than normal, i.e., than when the sky was clear.

Figure 13: Peak deviations in traffic correlated with weather conditions. The arrows point at a traffic condition at a particular time and weather.

Figure 14: Deviation in traffic prediction with weather. The orange region highlights the interval of thunderstorm and the dark green region highlights the interval of heavy rain. These two regions correspond to the highest deviations in traffic prediction.

Both intuitively and experimentally (Figure 14), traffic conditions are affected by weather conditions, and this impact of anomalous weather on traffic can be measured using the LSTM prediction error, as observed in this study.
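One simple way to quantify this effect is to compare the mean absolute prediction residual inside and outside the flagged weather intervals (a sketch with illustrative numbers; in practice the storm mask would come from the OpenWeatherMap observations aligned to the traffic intervals):

```python
import numpy as np

# Illustrative per-interval series: observed travel time, LSTM prediction,
# and a boolean flag marking thunderstorm intervals.
real = np.array([10, 11, 18, 25, 24, 12, 11, 10], float)
pred = np.array([10, 11, 13, 14, 15, 12, 11, 10], float)
storm = np.array([0, 0, 1, 1, 1, 0, 0, 0], bool)

residual = np.abs(real - pred)
storm_err = residual[storm].mean()   # mean deviation during the storm
clear_err = residual[~storm].mean()  # mean deviation under clear sky
```

A substantially larger `storm_err` than `clear_err`, as in Figure 14, suggests the deviation is weather-driven rather than model noise.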

7 CONCLUSION

In this paper we have discussed a low-cost data acquisition and fusion model for traffic flow analysis. To get high-quality traffic data, researchers [6, 10] typically must rely on government sources for data from hardware devices (e.g., loop detectors, GPS). Given the large amount of data required for this project, paid services like Google Maps are often not an option; thus, acquiring high-quality traffic data is one of the challenges in traffic prediction. However, by web scraping a crowdsourced website, i.e., Waze, we could gather estimated traffic information for the city. The OpenWeatherMap API allows us to retrieve relevant information on the possible weather conditions behind unexpected traffic flows.

Our LSTM model has shown good performance for short-term prediction; however, it lacks the capability to make long-term predictions. Utilizing the data fusion engine, we aim to build a more sophisticated model that can not only make long-term traffic predictions but also inform us about anomalous traffic behavior using diverse data sources such as social networks, news headlines, and accident logs. This will require the development of an efficient anomaly detection technique that can automate the informing process.

REFERENCES
[1] [n. d.]. Google Maps Distance Matrix API. https://developers.google.com/maps/documentation/distance-matrix/intro. Accessed 2018-05-29.
[2] [n. d.]. OpenWeatherMap. https://openweathermap.org. Accessed 2018-05-29.
[3] [n. d.]. Waze. https://www.waze.com. Accessed 2018-05-29.
[4] Afshin Abadi, Tooraj Rajabioun, and Petros A. Ioannou. 2015. Traffic flow prediction for road transportation networks with limited traffic data. IEEE Transactions on Intelligent Transportation Systems 16, 2 (2015), 653-662.
[5] Michael Auli, Michel Galley, Chris Quirk, and Geoffrey Zweig. 2013. Joint language and translation modeling with recurrent neural networks. 3 (2013).
[6] Chris Bachmann, Baher Abdulhai, Matthew J. Roorda, and Behzad Moshiri. 2013. A comparative assessment of multi-sensor data fusion techniques for freeway traffic speed estimation using microsimulation modeling. Transportation Research Part C: Emerging Technologies 26 (2013), 33-48.
[7] Zartasha Baloch, Faisal Karim Shaikh, and Mukhtiar Ali Unar. 2018. A context-aware data fusion approach for health-IoT. International Journal of Information Technology 10, 3 (2018), 241-245.
[8] Yuan-yuan Chen, Yisheng Lv, Zhenjiang Li, and Fei-Yue Wang. 2016. Long short-term memory model for traffic congestion prediction with online open data. In 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC). IEEE, 132-137.
[9] Zhiyong Cui, Ruimin Ke, and Yinhai Wang. 2018. Deep Bidirectional and Unidirectional LSTM Recurrent Neural Network for Network-wide Traffic Speed Prediction. arXiv preprint arXiv:1801.02143 (2018).
[10] Nour-Eddin El Faouzi and Lawrence A. Klein. 2016. Data fusion for ITS: techniques and research needs. Transportation Research Procedia 15 (2016), 495-512.
[11] Yoav Goldberg. 2016. A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research 57 (2016), 345-420.
[12] Jingrui He, Wei Shen, Phani Divakaruni, Laura Wynter, and Rick Lawrence. 2013. Improving Traffic Prediction with Tweet Semantics. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). 1387-1393.
[13] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9, 8 (1997), 1735-1780.
[14] Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3128-3137.
[15] Xiangang Li and Xihong Wu. 2015. Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 4520-4524.
[16] Zachary Chase Lipton, David C. Kale, Charles Elkan, and Randall C. Wetzel. 2015. Learning to Diagnose with LSTM Recurrent Neural Networks. CoRR abs/1511.03677 (2015). arXiv:1511.03677 http://arxiv.org/abs/1511.03677
[17] Yisheng Lv, Yanjie Duan, Wenwen Kang, Zhengxi Li, and Fei-Yue Wang. 2015. Traffic flow prediction with big data: a deep learning approach. IEEE Transactions on Intelligent Transportation Systems 16, 2 (2015), 865-873.
[18] Nicholas G. Polson and Vadim O. Sokolov. 2017. Deep learning for short-term traffic flow prediction. Transportation Research Part C: Emerging Technologies 79 (2017), 1-17.
[19] Haşim Sak, Andrew Senior, and Françoise Beaufays. 2014. Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition. arXiv preprint arXiv:1402.1128 (2014).
[20] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems. 3104-3112.
[21] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. 2015. Show and tell: A neural image caption generator. In Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 3156-3164.
[22] Edward Waltz, James Llinas, et al. 1990. Multisensor data fusion. Vol. 685. Artech House, Boston.
[23] Wojciech Zaremba, Ilya Sutskever, and Oriol Vinyals. 2014. Recurrent neural network regularization. arXiv preprint arXiv:1409.2329 (2014).


