
Page 1: USING ARCHIVED DATA TO SYSTEMATICALLY IDENTIFY AND ...bertini.eng.usf.edu/papers/bertini_aatt.pdf · bottleneck identification and analysis capabilities. One of PORTAL’s main features


USING ARCHIVED DATA TO SYSTEMATICALLY IDENTIFY AND PRIORITIZE FREEWAY BOTTLENECKS

Robert L. Bertini 1, Huan Li 2, Jerzy Wieczorek 3 and Rafael J. Fernández-Moctezuma 4

ABSTRACT. Bottlenecks are critical components of freeway systems and are of increasing

importance as congestion worsens in major urban areas. In the U.S., the Federal Highway

Administration has launched a new initiative aimed at identifying key bottlenecks in each

state. In Oregon, a freeway data archive has been established that stores vehicular count,

occupancy, and speed data from over 600 locations at 20-second intervals. Performance

measurement tools are also available and report on travel time, delay, vehicle miles traveled

and vehicle hours traveled. In addition, basic statistical tools are included that can report on

travel reliability across days, weeks, months and years. The Portland, Oregon metropolitan

region is known for its innovative transportation planning and multimodal transportation

system. This paper describes procedures used to develop an automated system to identify

freeway bottlenecks using historical data within an archived data user service environment.

Efforts are aimed at identification and display of active bottleneck features using graphical

tools. This research aims to detect bottleneck activation in real time and to expand the use of

reliability techniques. Efforts are aimed at improving the prioritization of improvements and

implementation of operational strategies. With the availability of archived data, new analysis

tools are possible at relatively low cost, with no new data collection.

INTRODUCTION

Understanding traffic behavior at freeway bottlenecks provides a foundation for understanding how the freeway system operates. A bottleneck is any point on the network upstream of which one finds a queue and downstream of which one finds freely flowing traffic. Bottlenecks can be static (e.g., a tunnel entrance) or dynamic (e.g., an incident or a slow-moving vehicle). A bottleneck is active while it meets the conditions described above, and it can be deactivated by a decrease in demand or by spillover from a downstream bottleneck (Daganzo 1997).

1 Associate Professor, Department of Civil and Environmental Engineering, Portland State University, Portland, Oregon, USA

2 M.S. Student, Department of Civil and Environmental Engineering, Portland State University, Portland, Oregon, USA

3 M.S. Student, Department of Mathematics and Statistics, Portland State University, Portland, Oregon, USA

4 Ph.D. Student, Department of Computer Science, Portland State University, Portland, Oregon, USA

Freeway congestion imposes significant costs on the movement of people and freight, and

new initiatives to measure freeway performance are being undertaken. In Oregon, toward

improving the management and operation of the transportation system as well as improving

transportation planning methods, a freeway data archive has been established that stores

vehicular count, occupancy, and speed data from over 600 locations at 20-second intervals.

Via a web interface, performance measurement tools are available and report on travel time,

delay, vehicle miles traveled and vehicle hours traveled. In addition, basic statistical tools are

included that can report on travel reliability across days, weeks, months and years. The

objective of this paper is to describe steps taken toward incorporating an automated system to

identify freeway bottlenecks using historical data within the Portland Oregon Regional

Transportation Archive Listing (PORTAL, see http://portal.its.pdx.edu) environment. Using

baseline knowledge of when and where bottlenecks occurred, a case study is described that

aims at identification and display of active bottleneck features using graphical tools. Previous

research conducted in California is validated for use in PORTAL, and suggestions for further

research are provided. With the availability of a rich source of archived data, new analytical

tools can be developed for use with historical data and in a real-time environment informed

by past performance.

BACKGROUND

Past research has sought a greater understanding of where freeway bottlenecks formed and

how and when they were activated, using oblique curves of cumulative vehicle arrival number

versus time and cumulative occupancy (or speed) versus time constructed from data measured

at neighboring freeway loop detectors. Once suitably transformed, these cumulative curves

provided the measurement resolution necessary to document and observe the transitions

between freely-flowing and queued conditions and to identify some important traffic features

(Cassidy and Windover 1995; Cassidy and Bertini 1999a, 1999b; Bertini and Myton, 2005;

Horowitz and Bertini, 2007). This same methodology has been applied here to systematically

define bottleneck activations and deactivations as a baseline for testing and implementing an

automated bottleneck detection algorithm.

Previous research has also developed techniques for automating and/or expanding the

identification of bottleneck activations and deactivations using comparisons of measured

traffic parameters and applying thresholds for those values. Most notably, Chen et al. (2004)

developed an algorithm that identified bottleneck locations and their activation and

deactivation times. That study used 5-minute inductive loop detector data from freeways in

San Diego, California, and focused on speed difference between adjacent detectors. Zhang

and Levinson (2004) also implemented a similar system to identify bottlenecks using

occupancy differentials. Banks (1991) used 30-second speed drop thresholds to identify

bottleneck activation in San Diego, and Hall and Agyemang-Duah (1991) developed a

threshold based on the ratio of 30-second occupancy divided by flow on a Canadian freeway.

Bertini (2003) tested several other signals, including the variance of the 30-second count, as indicators of bottleneck activation at several sites. Building on this past research, the intent in this paper is to test the Chen et al. method using a sensitivity analysis approach with data from a freeway in Portland, Oregon.


EXTENDING PORTAL’S FUNCTIONALITY

PORTAL has been in operation since July 2004. The regional archive has expanded its functionality over time. In addition to providing loop detector data at different aggregation levels, automatic data analysis tools have been developed and incorporated into the archive interface. The tools include the ability to produce oblique plots, travel time analyses, congestion reports, and information layers in time-space surface plots with incident and variable message sign information.

In response to the recent FHWA initiative of identifying bottleneck locations, the ITS Lab at

Portland State has identified and defined needs for a new PORTAL module that will provide

bottleneck identification and analysis capabilities. One of PORTAL’s main features is the use

of surface plots to represent the time-space evolution of a particular measurement. A design

goal for the bottleneck identification module is to visually represent the bottleneck activation

and deactivation times. A desired output is presented in Figure 1. Notice how the bottleneck

activation is marked as well as derived data, such as the average propagation speed of the

shockwave. The functionality of this tool will enable the study of previously observed

measurements with a single click, providing the most relevant information to the user:

activation, duration, deactivation, location, and shock propagation speed.

Figure 1. Desired output of the bottleneck analysis on historical data module for PORTAL

[Figure 1 legend: bottlenecks A and B with activation and deactivation markers; estimated propagation speed A: 25 mph, B: 21 mph]

The next step is to incorporate knowledge acquired through historical analysis into live data displays. Figure 2 illustrates an example of how previously known bottleneck locations can be displayed on the time-space diagram in a real-time environment. Bottlenecks A, B, and C have been detected and mapped on the plot, and the boxes illustrate durations and locations that occurred 90% of the time (for example) over a past period. These durations and locations can be learned by analyzing data from previous archived days, for example by analyzing the fifty previous non-holiday Wednesdays and characterizing recurring features for a Wednesday display. The example in Figure 2 illustrates the time-space characteristics of two known bottlenecks and provides a visual aid (bounding box). We acknowledge that the detection mechanism, depending on its parameters, may miss a detection, as with bottleneck A. In this case, displaying the additional layer of "learned" bottleneck locations provides the user with current analysis output and historical knowledge in a single plot. Historical knowledge of when and where to expect bottlenecks may also be used to forecast propagation speeds as new data arrive. As PORTAL continues to evolve, bottleneck information may also be displayed in combination with incident information, providing a rich tool for users who may want to distinguish recurrent congestion from incident-related congestion.

Figure 2. Desired output of incoming data analysis in combination with historically learned

parameters, such as expected bottleneck locations and activations

Having defined design objectives for what the bottleneck module of PORTAL should

provide, one can identify as a key requirement the need for the identification mechanism to be

automatic and require no user intervention. This assumes the particular mechanism has been

calibrated a priori by the PORTAL staff and should be ready to use. In the following sections,

a discussion of existing techniques is presented, as well as an experimental analysis of

parameter fitting for a candidate method suitable to be implemented into PORTAL.

ANALYTICAL FRAMEWORK

The algorithm described in Chen et al. used the presence of a sustained speed gradient between a pair of upstream-downstream detectors to identify bottlenecks. The most obvious signals for identifying a bottleneck in this method are the maximum speed measured by the upstream detector and the speed differential between each pair of detectors. This algorithm, which is simple and easy to adjust, was applied in San Diego using a 40 mph maximum upstream speed and a 20 mph speed differential with 5-minute speed data. However, the appropriate threshold values may differ in other cities such as Portland because of differences in the freeway network, geometry, weather, pavement, traffic conditions and the nature of the available data. In this paper, the speed thresholds and data aggregation levels comprising this algorithm are tested for bottleneck identification in Portland using PORTAL data. In addition, real bottleneck locations and activation/deactivation times identified using the method described in Cassidy and Bertini (1999) were used as baseline reference points for comparing the outcome of the Chen et al. method using different data and a range of parameters. Specifically, we tested the thresholds for maximum upstream speed and speed differential, and evaluated different data aggregation intervals using PORTAL data. The results are described in the next sections.

[Figure 2 legend: bottlenecks A, B and C with activation and deactivation markers; estimated propagation speed A: 25 mph, B: 22 mph, C: 21 mph; boxes mark the 90th percentile of historical bottlenecks]

SITE DESCRIPTION AND DATA PREPARATION

The implementation of the Chen et al. method was tested with archived data from the

northbound Interstate 5 (I-5) corridor in the Portland, Oregon metropolitan region, shown in

Figure 3. There are 23 detector stations along this 24-mile corridor. The average detector

spacing is 1.09 miles and the standard deviation is 0.72 miles.

Figure 3. Interstate 5 corridor in the Portland region

[Figure 3 map labels, south to north: MP 284; MP 290.54 Lower Boones Ferry; MP 296.26 Spring Garden; MP 299.7 Terwilliger Blvd.; Portland; MP 307.9 Jantzen Beach; MP 310]


Data from Wednesday, October 3, 2007 were extracted from the PORTAL ADUS at the finest available resolution of 20 seconds, including count, occupancy and time mean speed in each lane. For each station, a volume-weighted mean speed at 1-minute aggregation was compared to the arithmetic average, and the difference was not significant. Therefore, mean speeds by station were calculated at the coarser aggregation levels (1 minute, 3 minutes, and 5 minutes) using arithmetic averages: for each station, measurements from multiple lanes were combined into a single value by simple arithmetic averaging. The resulting value is applied across the region of influence of the sensor station, i.e., the segment of the corridor starting at the sensor station location and ending at the upstream sensor location.
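The comparison between volume-weighted and arithmetic averaging of lane speeds can be illustrated with a short Python sketch. The paper does not give its implementation, so the function names and the three-lane readings below are hypothetical and serve only to show the two calculations side by side.

```python
def arithmetic_mean_speed(lane_speeds):
    """Plain average of the per-lane speeds (mph)."""
    return sum(lane_speeds) / len(lane_speeds)


def volume_weighted_mean_speed(lane_speeds, lane_counts):
    """Average of per-lane speeds weighted by each lane's vehicle count."""
    total = sum(lane_counts)
    if total == 0:
        return None  # no vehicles observed, so no meaningful speed
    weighted = sum(s * c for s, c in zip(lane_speeds, lane_counts))
    return weighted / total


# Hypothetical 1-minute readings for a three-lane station (not PORTAL data).
speeds = [58.0, 61.0, 64.0]  # mph, per lane
counts = [20, 25, 15]        # vehicles, per lane

print(arithmetic_mean_speed(speeds))               # 61.0
print(volume_weighted_mean_speed(speeds, counts))  # 60.75
```

When lane volumes are similar, as in this made-up case, the two averages are close, which is consistent with the paper's choice of the simpler arithmetic average.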

Any data source will contain some errors. In this case, there is a large gap in the data at three consecutive detectors: milepost 292.18 has no data after 11:41 am, and the entire day's data are missing at mileposts 293.18 and 293.74. This constitutes 10,533 missing data points out of our 96,876-point dataset. Since we impute the data for these mileposts by taking the average of the data at the adjacent mileposts on either side, we can think of this as simply having one fewer detector station prior to 11:41 am and two fewer detectors afterwards. Thus, in practice we are working with a 90,555-point dataset. Besides this gap, the rest of the dataset is missing only 4,277 points, or 4.72% of our data. Most of these remaining missing points are isolated and randomly distributed across the day and corridor.

Some basic imputation was performed during the aggregations. When aggregating the 20-second data into 1-minute data, all nonzero values for each minute (i.e., the nonzero values among the three 20-second data points in each 1-minute period) were averaged. Any remaining missing data elements were imputed by averaging the known values for the detectors on either side of the data gap. After this step, no values were missing when the 1-minute data were aggregated into the 3-, 5-, and 15-minute data sets. Incident information provided by ODOT was also considered, and is reported in Table 1.

Table 1. Incidents for October 3, 2007. Data provided by ODOT.

Type            Location                                      Time
Stall           I-5 Northbound Just North Of BURNSIDE         4:54:52
Stall           I-5 Northbound Just South Of TAYLORS FERRY    8:29:45
Debris          I-5 Northbound BOECKMAN                       8:33:33
Debris          I-5 Northbound Just South Of COLUMBIA BLVD    9:09:52
Stall           I-5 Northbound Just South Of SANDY BLVD       10:20:13
Stall           I-5 DELTA PARK                                11:14:12
Other Closure   I-5 Northbound At TRUCK LANE                  13:12:47
Accident        I-5 Northbound At BROADWAY                    13:41:10
Construction    I-5 Northbound At NYBERG RD - BONITA          19:53:11
Other Incident  I-5 Northbound MARINE DR WEST                 21:40:29
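The aggregation and imputation rules described before Table 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, missing readings are assumed to be encoded as 0 or None, and the fallback of using a single neighbor at the ends of the corridor is an assumption that the paper does not address.

```python
def aggregate_nonzero(values, group_size=3):
    """Average the nonzero readings in each consecutive group.

    With 20-second inputs and group_size=3 this yields 1-minute values.
    Readings of 0 or None are treated as missing; a group with no valid
    reading stays missing (None).
    """
    out = []
    for i in range(0, len(values), group_size):
        valid = [v for v in values[i:i + group_size] if v]
        out.append(sum(valid) / len(valid) if valid else None)
    return out


def impute_from_neighbors(station_values):
    """Fill gaps at one time step using the nearest known stations.

    station_values is ordered along the corridor. Each missing entry is
    replaced by the mean of the nearest known value on either side; if
    only one side has data, that value is used (an assumption).
    """
    n = len(station_values)
    filled = list(station_values)
    for i, v in enumerate(station_values):
        if v is not None:
            continue
        left = next((station_values[j] for j in range(i - 1, -1, -1)
                     if station_values[j] is not None), None)
        right = next((station_values[j] for j in range(i + 1, n)
                      if station_values[j] is not None), None)
        known = [x for x in (left, right) if x is not None]
        filled[i] = sum(known) / len(known) if known else None
    return filled


# Hypothetical 20-second speeds (mph) covering two minutes at one station.
print(aggregate_nonzero([60, 0, 58, 0, 0, 0]))           # [59.0, None]
# Hypothetical 1-minute speeds at four consecutive stations.
print(impute_from_neighbors([55.0, None, None, 45.0]))   # [55.0, 50.0, 50.0, 45.0]
```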

Because the apparent overnight bottleneck activation (midnight to 4:00 am) looked suspicious, the speed and volume time series at Capital Hwy (milepost 295.18) were analyzed further. Data from this location were clearly noisy overnight, particularly when volumes were low. Overnight measurement reliability has proven challenging on Portland freeways, and ongoing work has exposed calibration issues with detectors (Cheevarunothai et al., 2006). Time series plots for speed and volume are shown in Figures 4 and 5.

Figure 4. Speed time series for Capital Highway location. Notice the "unnatural" variance in the 00:00 to 04:00 period.

Figure 5. Volume time series for Capital Highway location. Low volumes overnight were correlated with atypical speed measurements.

BASELINE BOTTLENECK ANALYSIS

In order to establish a baseline to act as "ground truth" for this experiment, a detailed analysis of bottleneck activations for October 3, 2007 was conducted. Figure 6 shows a time series speed surface plot from PORTAL for this day. This corridor has two empirically observed recurring bottlenecks. The first active bottleneck (A) occurs between the Macadam Ave. exit and the Terwilliger Blvd. exit (milepost 299.7) between 7:45 and 10:00 am. This location, referred to as the "Terwilliger Curves," is often mentioned in radio traffic reports. The second active bottleneck (C) is detected near Jantzen Beach (milepost 307.9) between 2:30 and 7:30 pm. This afternoon bottleneck corresponds to commuters returning to Vancouver, Washington. In addition, two other locations appear to be bottlenecks on the speed map. One (D) is located at station 296.26 (Spring Garden) between midnight and 4:00 am (due to the data errors described above), and the other (B) is located at station 290.54 (Lower Boones Ferry) between 7:15 and 9:15 am.

Figure 6. Baseline analysis of bottleneck locations and algorithm detection on Interstate 5

corridor on October 3, 2007

The activation and deactivation times of each bottleneck were analyzed using oblique curves

of cumulative vehicle arrival number versus time and cumulative occupancy (or speed) versus

time constructed from data measured at neighboring freeway loop detectors. Using procedures

described in Horowitz and Bertini (2007) and other sources (Cassidy and Windover, 1995;

Cassidy and Bertini, 1999a, 1999b; Bertini, 2003; Bertini and Myton, 2005), Figure 7

illustrates the partial diagnosis of bottleneck A for October 3, 2007.

[Figure 6 legend: Time series speed surface plot for I-5 North on October 3, 2007 (units in mph); bottlenecks A, B, C and D with activation, deactivation and algorithm detection markers; estimated propagation speed A: 20 mph, B: 17 mph, C: 15 mph, D: 15 mph]

In order to diagnose and verify the activation and deactivation times of bottleneck A, Figure 7 shows oblique curves of count (left axis) and speed (right axis) for the Terwilliger Blvd. station at milepost 299.7, immediately upstream of the bottleneck. The same analysis was done for each station in the vicinity of Terwilliger Blvd. The cumulative plots are re-scaled in order to magnify their features. The value of this visual tool is that the prevailing flows and speeds measured at the station are seen as slopes of the oblique cumulative plot. This method filters out noise without altering the times at which flow and speed changes occurred. For the left axis, representing cumulative count, a scaling factor q0 = 3,760 vph was used. For the right axis, a scaling factor b0 = 52.2 mph was used. The two vertical lines delineate the bottleneck's active period, beginning at 7:45:20 and ending at 10:00:40. The straight lines superimposed on the curves are piecewise linear approximations to the average flow (across three lanes) and average speed. The times when average flow and speed change slope are labeled, and the average prevailing flows and speeds are labeled near the curves in vph and mph, respectively. The bottleneck's activation at 7:45:20 was marked by the passage of a queue past the Terwilliger Blvd. station, seen as a drop in flow from 4,800 to 3,730 vph and a corresponding drop in speed. The later deactivation of the bottleneck was marked by an increase in speed and a further drop in flow after the morning peak period had ended and demand had dropped.
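The oblique re-scaling described above subtracts a constant background rate from the cumulative curve so that changes in slope stand out. A minimal Python sketch follows; the function names and the sample counts are hypothetical, and expressing the background rate per 20-second interval is an assumption about units rather than a detail taken from the paper.

```python
from itertools import accumulate


def per_interval(rate_per_hour, interval_seconds):
    """Convert an hourly rate (e.g. q0 in vph) to a per-interval amount."""
    return rate_per_hour * interval_seconds / 3600.0


def oblique_curve(samples, background_per_interval):
    """Cumulative sum of samples minus a linear background trend.

    After k intervals the value is sum(samples[:k]) - background * k, so a
    prevailing rate above the background appears as a rising segment and
    a rate below it as a falling segment.
    """
    return [c - background_per_interval * (k + 1)
            for k, c in enumerate(accumulate(samples))]


# Hypothetical 20-second counts across three lanes: high flow, then a drop.
counts = [27, 27, 26, 21, 21, 20]
q0 = 3760  # vph, the background flow quoted in the text
curve = oblique_curve(counts, per_interval(q0, 20))
print([round(x, 1) for x in curve])
```

In this sketch the curve rises while the counts exceed the background rate and flattens once they fall to roughly that rate, which is the kind of slope change used to time a flow drop.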

Figure 7. Terwilliger Blvd. oblique count and speed plot, October 3, 2007

The procedure described above and illustrated in Figure 7 was performed for all candidate bottleneck activations marked on Figure 6. Knowing the times at which a transition from unqueued to queued conditions occurred at each station also allowed the shock speed between stations to be estimated; this speed is shown on Figure 6 for each activation. The bottleneck activation and deactivation times are shown in Table 2, and can now serve as the baseline "ground truth" for experiments aimed at developing and testing an automated bottleneck detection capability. Having this baseline facilitates a sensitivity analysis that will reveal the best combination of data aggregation level and speed thresholds to be used within PORTAL's bottleneck identification system.
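As an illustration of the shock speed estimate, the distance between two stations can be divided by the difference between the times at which the queue reached each of them. The Python sketch below assumes that simple distance-over-time calculation; the function name is hypothetical, and the second station (MP 297.7) and its queue arrival time are invented for the example.

```python
from datetime import datetime


def shock_speed_mph(milepost_a, time_a, milepost_b, time_b):
    """Estimate shock propagation speed between two stations.

    Mileposts are in miles and times are 'HH:MM:SS' strings on the same
    day. Returns the distance divided by the elapsed time between the
    queue arrivals at the two stations.
    """
    fmt = "%H:%M:%S"
    delta = datetime.strptime(time_b, fmt) - datetime.strptime(time_a, fmt)
    hours = abs(delta.total_seconds()) / 3600.0
    if hours == 0:
        raise ValueError("queue arrival times must differ")
    return abs(milepost_b - milepost_a) / hours


# Queue passes MP 299.7 at 7:45:20 (from the text); the upstream station
# at MP 297.7 and its 7:51:20 arrival time are hypothetical.
print(round(shock_speed_mph(299.7, "07:45:20", 297.7, "07:51:20"), 1))  # 20.0
```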

[Figure 7 plot: Oblique Count and Speed for Terwilliger Blvd., MP 299.7, October 3, 2007. Horizontal axis: Time, 6:00 to 11:00. Left axis: Oblique Count (q0 = 3,760 vph). Right axis: Oblique Speed (b0 = 52.2 mph). Series: Count (vph) and Speed (mph), with slope-change times labeled.]


Table 2. Number of Speed Pairs Evaluated for Bottleneck Presence.

                                                 Sample Size
Bottleneck  Milepost  Activation  Deactivation   20 sec  1 min  3 min  5 min  15 min
B           290.54    7:15        9:15           360     120    40     24     8
D           296.26    0:00        4:00           720     240    80     48     16
A           299.7     7:45        10:00          405     135    45     27     9
C           307.9     14:30       19:30          1,080   360    120    72     24
Total                                            2,565   855    285    171    57

EXPERIMENTAL SETUP

Now that bottlenecks have been analyzed as a baseline, in order to test the Chen et al. algorithm on the Portland dataset from PORTAL, we implemented the core of the method (identifying the locations of bottlenecks) using MATLAB. Chen's method compares each pair of longitudinally adjacent detectors at each 5-minute time point and declares that there is a bottleneck between them when:

1. The speed at the upstream detector is below the maximum speed threshold, and

2. The difference in the speeds at the upstream and downstream detectors is above the speed differential threshold.

Given that a bottleneck separates unqueued (downstream) from queued (upstream) traffic states, Chen et al. used a maximum speed threshold of 40 mph and a speed differential threshold of 20 mph, with data aggregated at 5-minute intervals. That is to say, if the upstream speed is less than 40 mph and the downstream speed is more than 20 mph greater than the upstream speed, the Chen et al. method identifies a bottleneck between those detectors during that interval. However, Chen et al. recognized that each of these three parameters (maximum speed, minimum differential, and aggregation level) may need to be adjusted for the algorithm to work optimally in new settings. In this research, we have tested different values of both thresholds and of the aggregation level, as shown in Table 3.
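The two-condition rule above can be sketched in Python as follows. This is only an illustration of the core comparison, not the authors' MATLAB implementation: the function name, the data layout (one row of station speeds per time step, ordered upstream to downstream) and the sample speeds are assumptions, and any additional filtering for sustained gradients in the original Chen et al. method is not reproduced here.

```python
def detect_bottlenecks(speeds, max_upstream_speed=40.0, min_speed_diff=20.0):
    """Flag detector pairs that satisfy the two speed conditions.

    speeds[t][i] is the speed (mph) at time step t and station i, with
    stations ordered from upstream to downstream. Returns the set of
    (t, i) pairs for which a bottleneck is declared between station i
    (upstream) and station i + 1 (downstream).
    """
    flagged = set()
    for t, row in enumerate(speeds):
        for i in range(len(row) - 1):
            upstream, downstream = row[i], row[i + 1]
            slow_upstream = upstream < max_upstream_speed
            large_gap = (downstream - upstream) > min_speed_diff
            if slow_upstream and large_gap:
                flagged.add((t, i))
    return flagged


# Hypothetical speeds for three time steps at three stations.
speeds = [
    [60.0, 58.0, 62.0],  # free flow everywhere
    [35.0, 57.0, 61.0],  # queue upstream of the first pair
    [30.0, 38.0, 60.0],  # queue has grown; head is now at the second pair
]

print(sorted(detect_bottlenecks(speeds)))            # [(1, 0), (2, 1)]
print(sorted(detect_bottlenecks(speeds, 45.0, 5.0)))  # [(1, 0), (2, 0), (2, 1)]
```

The second call uses looser thresholds and flags an extra pair, which illustrates why the threshold values need to be tested against known bottleneck activations.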

Table 3. Input Variables Tested for Bottleneck Identification.

Input Variables                                    Values Tested
Max speed detected by upstream detector (mph)      30, 35, 40, 45, 50
Speed differential (mph)                           10, 15, 20, 25, 30
Data aggregation level for measured speed data     20 sec, 1 min, 3 min, 5 min, 15 min

As shown in Table 2 above, the total number of speed-pairs where bottlenecks are activated depends on the speed data aggregation level. Thus, raw counts of successful identifications and false alarms are not comparable across aggregation levels, so their proportions were used for evaluation instead. The sensitivity analysis compares the success rate and false alarm rate, each computed relative to the known bottleneck activation and deactivation times and locations.

The result of our tests is an assessment of how many bottleneck instances are captured by the Chen et al. method with the different threshold combinations (maximum upstream speed and speed differential) and using different aggregation levels for the measured speed data. As an example, Figure 8 shows the outcomes for all 25 threshold combinations using 5-minute speed data. Each time-space plot corresponds to one combination of speed differential and maximum speed measured at the upstream detector, and is labelled with that pair of values.

The success rate and false alarm rate for the given thresholds are considered as a function of the input values. The success rate is the proportion of real bottleneck speed-pairs that are successfully captured by the Chen et al. method; these correctly found bottlenecks are labelled as magenta triangles in Figure 8. The false alarm rate is the proportion of speed-pairs captured by the Chen et al. method that are not real bottlenecks; these false alarms are labelled as dark triangles in Figure 8. In Figure 8, the correct triangles and the false ones are easy to distinguish visually. The same procedure was repeated for the other aggregation levels shown in Table 3.
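The two rates defined above can be computed from sets of flagged and known speed-pairs, as in the following Python sketch. The function name and the example sets are hypothetical; the only point illustrated is that the success rate is measured against the known bottleneck pairs, while the false alarm rate is measured against the pairs the method captured.

```python
def success_and_false_alarm_rates(detected, actual):
    """Return (success_rate, false_alarm_rate) for one parameter setting.

    detected: speed-pairs flagged by the detection method.
    actual:   speed-pairs known to be real bottlenecks (ground truth).
    success_rate     = captured real pairs / all real pairs
    false_alarm_rate = flagged pairs that are not real / all flagged pairs
    A rate is None when its denominator is zero.
    """
    detected, actual = set(detected), set(actual)
    hits = detected & actual
    success = len(hits) / len(actual) if actual else None
    false_alarm = len(detected - actual) / len(detected) if detected else None
    return success, false_alarm


# Hypothetical (time step, detector pair) identifiers.
actual = {(1, 0), (2, 0), (3, 0), (4, 0)}
detected = {(2, 0), (3, 0), (4, 0), (4, 1)}

print(success_and_false_alarm_rates(detected, actual))  # (0.75, 0.25)
```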

Figure 8. Outcome of bottleneck identification with different combinations of the two speed thresholds using 5 min speed data

Figure 9 compares the percentage of successes (vertical axis) across different thresholds (horizontal axes) and different speed data aggregation levels. The algorithm results are compared with the benchmark "known" activations described above. Typically, a speed differential of 20 mph could be assumed to be a signal of bottleneck activation. However, the plot makes clear that the success rate increases (regardless of aggregation level) as the maximum upstream speed increases and the minimum speed differential decreases, i.e., when the criteria for defining a bottleneck are loosened. With PORTAL speed data, the Chen et al. method achieves its highest success rate when the maximum upstream speed is 50 mph, the speed differential between a pair of upstream-downstream detectors is 10 mph, and 5-minute data are used.


Note that 5 minute data is not optimal across the board: at some thresholds it is surpassed by

15 minute data, while at others (especially when the minimum speed differential is high) the

finer aggregation levels such as 20-second and 1-minute data actually perform better.

Figure 9. The proportion of true bottleneck speed-pairs that are successfully captured by the

Chen et al. method

Figure 10, on the other hand, shows the false alarm rate (vertical axis) over the same thresholds and aggregation levels. Again, regardless of data aggregation level, the percentage of all speed-pairs captured by the Chen et al. method that turn out to be false alarms is also maximized at the 50 mph maximum upstream speed and 10 mph minimum speed differential. In this case, the aggregation level has the same effect throughout: finer aggregation levels such as the 20-second data have the highest false alarm rates across the board, while the coarsest aggregation, the 15-minute data, has the lowest false alarm rate at each point.

Figure 10. The proportion of speed-pairs captured by Chen’s method which are not real

bottlenecks

As illustrated in Figures 9 and 10, the 5 and 15 minute aggregation levels are able to produce

the highest success rates and lowest false alarm rates. Figure 12 provides a visual comparison

between the results with different speed data aggregation levels, shown on time-space

diagrams of the I-5 corridor. The selected (minimum speed differential, maximum upstream speed)

threshold pairs are shown on the right side of the figure. The 20 second aggregation level data

shows an unacceptably large number of false alarm triangles, which clutter the diagram.

Meanwhile, the 15 min time interval data misses many

true bottlenecks at some threshold combinations.

Figure 12. Comparison of outcomes with different time interval speed data

CONCLUSIONS

From Figure 10 we see that 5 and 15 minute aggregation levels produce the lowest false alarm

rates at each combination of thresholds. In other words, aggregation smooths out some of the

noise in the data that would otherwise lead to false alarms. In Figure 9, we also note that

success rates over 80% are achievable only for minimum speed differentials of 15 mph or less

and maximum upstream speeds of 40 or more. The 5 and 15 minute aggregation levels have

the highest success rate throughout this region. Consequently, the 5 and 15 minute

aggregations appear to be ideal. Anything coarser than 15 minutes, however, would be

unlikely to reveal information that is useful for purposes of bottleneck identification, since it

is desired to be able to pinpoint the start and end of a bottleneck as precisely as possible. In

fact, when following the leading edge of a bottleneck to estimate the rate at which the

activation shock propagates upstream, finer data aggregation levels may be necessary in order

to estimate the propagation rate with enough precision. For this reason, the 5 minute

aggregation is a better choice than the 15 minute data.
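To illustrate how aggregation smooths out noise, the sketch below averages consecutive 20-second speed readings into coarser bins (15 readings per 5 minute bin, 45 per 15 minute bin). It assumes a simple arithmetic mean that skips missing readings; the aggregation actually used in PORTAL may differ (for example, it could be weighted by vehicle count), and the function name is hypothetical.

```python
from statistics import fmean
from typing import List, Optional, Sequence


def aggregate_speeds(speeds_20s: Sequence[Optional[float]],
                     readings_per_bin: int) -> List[Optional[float]]:
    """Average consecutive 20-second speeds (mph) into coarser bins.

    None marks a missing reading; a bin with no valid readings yields None.
    The last bin may be shorter if the input length is not a multiple of
    readings_per_bin.
    """
    if readings_per_bin < 1:
        raise ValueError("readings_per_bin must be at least 1")
    bins: List[Optional[float]] = []
    for start in range(0, len(speeds_20s), readings_per_bin):
        window = [v for v in speeds_20s[start:start + readings_per_bin]
                  if v is not None]
        bins.append(fmean(window) if window else None)
    return bins


# Illustrative values only: one noisy 25 mph reading among fourteen 55 mph readings.
five_minute = aggregate_speeds([55.0] * 7 + [25.0] + [55.0] * 7, readings_per_bin=15)
print(five_minute)  # [53.0]
```

A single 20-second dip that might satisfy the speed thresholds on its own nearly vanishes in the 5 minute average, at the cost of temporal resolution.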

It appears that the success rate does not increase much for differentials below 15 mph and

maximum speeds above 40 mph, so this combination of thresholds along with a 5 minute data

aggregation level appears to be an optimal choice of settings for this algorithm. Hence, the

Chen et al. original settings used for the San Diego data—20 mph differential, 40 mph

maximum upstream speed, and 5 minute aggregation—are indeed close to a reasonable

balance for Portland. Figure 12 illustrates this, showing that (20, 40) at 5 minutes finds most

of the real bottlenecks while having an acceptably low number of false alarms.

It is clear that each user’s optimal choice of thresholds and data aggregation level will depend

on how the user balances the need for a high success rate against the need for a low false

alarm rate. Commuters checking the traffic report before heading out to work may

prefer to have a high success rate (and thus avoid being stuck in traffic) even if it comes with

frequent false alarms. On the other hand, researchers or transportation operations staff

searching for the most important bottlenecks to repair may prefer to have a low false alarm

rate so that money is not wasted on fixing a non-existent problem.
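One way to make this trade-off explicit is a weighted score, sketched below. The scoring rule, the weight `w`, and every number in the example are assumptions for illustration; none of them are results from this study.

```python
from typing import Dict, Tuple

# Key: (min speed differential in mph, max upstream speed in mph, aggregation label).
# Value: (success rate, false alarm rate), both in [0, 1].
Rates = Dict[Tuple[int, int, str], Tuple[float, float]]


def best_setting(rates: Rates, w: float) -> Tuple[int, int, str]:
    """Return the setting that maximizes w * success - (1 - w) * false_alarm.

    w near 1 favors catching every bottleneck; w near 0 favors avoiding
    false alarms. This scoring rule is a hypothetical choice.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must be between 0 and 1")
    return max(rates, key=lambda k: w * rates[k][0] - (1.0 - w) * rates[k][1])


# Placeholder rates, NOT measured values from the paper.
example: Rates = {
    (10, 50, "5 min"): (0.90, 0.40),
    (20, 40, "5 min"): (0.75, 0.10),
}
print(best_setting(example, w=0.9))  # commuter-like weighting
print(best_setting(example, w=0.3))  # operations-like weighting
```

Under these placeholder numbers, a weighting that emphasizes success picks the looser thresholds, while a weighting that penalizes false alarms picks the stricter ones.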

FUTURE WORK

The results described here are promising and are leading to additional research toward

automating the process of identifying bottlenecks on Portland freeways. One next step

consists of testing the use of other variables such as measured volume as a criterion for

sorting recurrent bottlenecks from false positives due to data errors. A further step is to

add the ability to track sustained bottlenecks and discard transient false positives

that are isolated (in the time dimension) from other bottlenecks. In this paper we have

presented measurements of false positive or false negative tracking, but have not concluded

anything regarding an optimal level of false positives or negatives. This level may vary by user,

and could be a setting chosen by a PORTAL user in an iterative way.
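The idea of discarding transient detections that are isolated in time could be sketched as a minimum-duration filter. This is a hypothetical illustration of the planned step, not an implemented PORTAL feature; the function name and the `min_run` parameter are assumptions.

```python
from typing import List, Sequence


def drop_isolated(flags: Sequence[bool], min_run: int = 2) -> List[bool]:
    """Keep only detections that belong to a run of at least min_run
    consecutive time intervals at one detector pair."""
    if min_run < 1:
        raise ValueError("min_run must be at least 1")
    kept = [False] * len(flags)
    i = 0
    while i < len(flags):
        if not flags[i]:
            i += 1
            continue
        j = i
        while j < len(flags) and flags[j]:
            j += 1  # advance to the end of the current run of detections
        if j - i >= min_run:
            kept[i:j] = [True] * (j - i)
        i = j
    return kept
```

For example, with `min_run=2` a lone detection surrounded by undetected intervals is dropped, while three consecutive detections are retained.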

When implemented in PORTAL, historical data will be compiled to "learn" where and when

recurrent bottlenecks occur. This will facilitate some simple forecasting procedures that can

be shown on the timeseries speed surface plots as the loop detector live feed proceeds. Finally,

travel reliability on a given corridor may be more important than travel time alone for

travelers, shippers, and transport managers. In addition to identifying recurrent bottlenecks

using measures of delay, we plan to test several reliability measures including travel time,

95th percentile travel time, travel time index, buffer index, planning time index, and

congestion frequency.
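For reference, several of these measures can be computed from a set of observed travel times as sketched below. The formulas follow commonly used definitions: travel time index = mean travel time / free-flow travel time; planning time index = 95th percentile travel time / free-flow travel time; buffer index = (95th percentile travel time - mean travel time) / mean travel time. The nearest-rank percentile rule and the function names are assumptions for illustration.

```python
from statistics import fmean
from typing import Dict, Sequence


def percentile_95(values: Sequence[float]) -> float:
    """95th percentile by the nearest-rank rule (one of several conventions)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def reliability_measures(travel_times: Sequence[float],
                         free_flow_time: float) -> Dict[str, float]:
    """Compute basic reliability measures; all times share one unit (e.g. minutes)."""
    if free_flow_time <= 0:
        raise ValueError("free_flow_time must be positive")
    mean_tt = fmean(travel_times)
    p95 = percentile_95(travel_times)
    return {
        "mean_travel_time": mean_tt,
        "p95_travel_time": p95,
        "travel_time_index": mean_tt / free_flow_time,
        "planning_time_index": p95 / free_flow_time,
        "buffer_index": (p95 - mean_tt) / mean_tt,
    }
```

Congestion frequency is omitted here because it depends on how a congested interval is defined, which this section does not specify.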

ACKNOWLEDGEMENTS

The authors thank the Oregon Department of Transportation, the

Federal Highway Administration, TriMet and the City of Portland for their ongoing support

and guidance in the development of the Portland ADUS. This work is supported by the

National Science Foundation under Grant No. 0236567, and the Mexican Consejo Nacional

de Ciencia y Tecnología (CONACYT) under Scholarship 178258. The Portland metropolitan

area’s TransPort ITS committee has also encouraged development of this project.

REFERENCES

Banks, J.H. (1991). "Two-capacity Phenomenon at Freeway Bottlenecks: a Basis for Ramp

Metering?" Transportation Research Record: Journal of the Transportation Research

Board, 1320, 83–90.

Bertini, R. (2003). "Toward the Systematic Diagnosis of Freeway Bottleneck Activation".

Proceedings of the IEEE 6th International Conference on Intelligent Transportation

Systems. Shanghai, China.

Bertini, R.L. and Myton, A. (2005). "Using PeMS Data to Empirically Diagnose Freeway

Bottleneck Locations in Orange County, California". Transportation Research Record:

Journal of the Transportation Research Board, 1925, 48-57.

Cassidy, M.J. and Bertini, R.L. (1999a). "Some Traffic Features at Freeway Bottlenecks".

Transportation Research, Part B, 33(1), 25–42.

Cassidy, M. J. and Bertini, R. L. (1999b). "Observations at a Freeway Bottleneck". Proc. of

Fourteenth International Symposium on Transportation and Traffic Theory, Jerusalem,

Israel, 107-124.

Cassidy, M. J. and Windover, J. R. (1995). "Methodology for Assessing Dynamics of Freeway

Traffic Flow". Transportation Research Record: Journal of the Transportation Research

Board, 1484, 73-79.

Cheevarunothai, P., Wang, Y. and Nihan, N. (2006). "Identification and Correction of

Dual-Loop Sensitivity Problems". Transportation Research Record: Journal of the

Transportation Research Board, 1945, 73-81.

Chen, C., Skabardonis, A. and Varaiya, P. (2004). "Systematic Identification of Freeway

Bottlenecks". Proceedings of 83rd Transportation Research Board Annual Meeting.

Washington, D.C., U.S.A.

Daganzo, C. F. (1997). Fundamentals of Transportation and Traffic Operations. Elsevier

Science, New York, U.S.A.

Das, S. and Levinson, D. (2004). "A Queuing and Statistical Analysis of Freeway Bottleneck

Formation". ASCE Journal of Transportation Engineering, 130(6), 787-795.

Hall, F.L. and Agyemang-Duah, K. (1991). "Freeway Capacity Drop and the Definition of

Capacity". Transportation Research Record: Journal of the Transportation Research

Board, 1320, 91–98.

Horowitz, Z. and Bertini, R.L. (2007). "Using PORTAL Data to Empirically Diagnose

Freeway Bottlenecks Located on Oregon Highway 217". Proceedings of the 86th Annual

Meeting of the Transportation Research Board. Washington, D.C., U.S.A.

Zhang, L. and Levinson, D. (2004). "Ramp Metering and Freeway Bottleneck Capacity".

Transportation Research Part A (in press).