Interaction-Aware Tracking and Lane Change Detection in ...

HAL Id: hal-01534094

https://hal.inria.fr/hal-01534094

Submitted on 7 Jun 2017

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Interaction-Aware Tracking and Lane Change Detection in Highway Scenarios Using Realistic Driver Models

David Sierra González, Víctor Romero-Cano, Jilles Dibangoye, Christian Laugier

To cite this version:

David Sierra González, Víctor Romero-Cano, Jilles Dibangoye, Christian Laugier. Interaction-Aware Tracking and Lane Change Detection in Highway Scenarios Using Realistic Driver Models. ICRA 2017 Workshop on Robotics and Vehicular Technologies for Self-driving cars, Jun 2017, Singapore, Singapore. ⟨hal-01534094⟩

https://hal.inria.fr/hal-01534094

https://hal.archives-ouvertes.fr

Interaction-Aware Tracking and Lane Change Detection in Highway Scenarios Using Realistic Driver Models (Extended Abstract)

David Sierra González, Víctor Romero-Cano, Jilles S. Dibangoye, and Christian Laugier

Abstract: We address the problem of multi-vehicle tracking and motion prediction in highway scenarios using information from sensors and perception systems widely used in automated driving. In particular, we focus on the detection of lane change maneuvers. Dangerous lane changing constitutes the main cause of highway accidents and a reliable detection system is still lacking on modern cars. Our prediction approach is two-fold. First, a driver model learned from demonstrations via Inverse Reinforcement Learning is used to equip a host vehicle with the anticipatory behavior reasoning capability of common drivers. Second, inference on an interaction-aware augmented Switching State-Space Model allows the approach to account for behaviors that deviate from those learned from demonstrations. In this paper, we show how to combine model-based behavior prediction and filtering-based state and maneuver tracking in order to detect lane changes in highway scenarios, and present the results obtained on real data gathered with an instrumented vehicle.

I. INTRODUCTION

Interacting with human agents is one of the major challenges faced by Advanced Driver Assistance Systems (ADAS) and autonomous driving vehicles. In spite of the complexity of the problem, human drivers are extremely good at predicting the intentions of surrounding vehicles. This is due to their innate ability to interpret the motion cues of other drivers and to reason over their most likely risk-aversive behavior.

In previous work, our efforts were focused on modeling the risk-aversive behavior of drivers from driving demonstrations using Inverse Reinforcement Learning (IRL), and on using the resulting models to predict the development of highway traffic scenes [1]. The main drawback of this approach is that it fails to consider all the dynamic evidence that might give away potentially dangerous maneuvers that deviate from the expected risk-aversive behavior of highway drivers.

In order to exploit the predictive potential of IRL driver models without disregarding any physical evidence, we introduce a probabilistic framework that predicts the lane change intentions of highway drivers by merging model-based and tracking-based maneuver inference.

II. RELATED WORK

Research in the fields of state estimation and scene prediction for Intelligent Vehicles has increased significantly in the past decade. In the first place, we can identify tracking-based maneuver inference approaches, which identify the motion model that best fits the current dynamics of a maneuvering target from among a discrete set of models [2], [3]. These approaches are typically based on an Interacting Multiple-Model (IMM) filter, in which the switching between dynamics is Markovian, causing the performance to depend heavily on a proper tuning of the regime transition matrix. Furthermore, they fail to consider the interactions between the traffic participants in the switching process.

This work has been supported by Toyota Motor Europe. The authors are with INRIA Grenoble Rhone-Alpes, 38330 Montbonnot-Saint-Martin, France - {david.sierra-gonzalez, victor.romero-cano, jilles.dibangoye, christian.laugier}@inria.fr

Dynamic Bayesian Networks (DBN) have been used to explicitly consider the interactions between traffic participants in the long-term prediction of traffic situations [4]. The behavior of each vehicle is inferred by identifying its current situation (e.g. close to the vehicle in front) as a function of its local situational context (e.g. distance to the vehicle in front). This approach is conceptually sound but relies strongly on particle filtering, which may limit its applicability in complex real-world scenarios.

Another approach along the same lines is presented in [5]. High-level discrete contexts determine the evolution of low-level dynamics. In contrast to the IMM approaches discussed above, the switching process is only conditionally Markov, depending also on the continuous state at the preceding time step through a linear feature-based function. The major difference with [4] lies in the inference engine. Instead of relying on particle filtering, approximate inference is performed using a variation of the forward pass for augmented Switching Linear Dynamical Systems (aSLDS) presented in [6]. The approach is evaluated on its ability to track interacting trucks in an opencast mine. The model describing the interactions is relatively simple, considering only the time-to-collision and the right-of-way as factors.

The proposed approach builds upon [1] and [5]. The model-based prediction from [1] is integrated into a filtering framework in order to: 1) reason about the most likely high-level, interaction-aware behavior of drivers by considering its near-future consequences; and 2) use the expected model-based behavior to reinforce and accelerate the detection of changes in the dynamics of the targets.

III. FRAMEWORK

A. Driver modeling

In order to learn the feature-based cost function describing the preferences of a driver from driving demonstrations, we first model the dynamics of the driver as a Markov Decision Process (MDP), then select a number of relevant features using background knowledge, and finally use an IRL algorithm to learn the balance between the different terms of the cost function according to the behavior demonstrated. More details can be found in [1]. The resulting cost function is linear in the features, $C(s) = \theta \cdot f(s)$, where $\theta = (\theta_1, \ldots, \theta_K)$ is the weight vector and $f(s) = (f_1(s), \ldots, f_K(s))$ is the feature vector that parameterizes state $s$.
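As a minimal sketch of the linear cost above, the following Python snippet evaluates $C(s) = \theta \cdot f(s)$ for a single state. The weights and feature values are invented for illustration; the source does not list the features or the learned weights, and `linear_cost` is a hypothetical helper name.

```python
from typing import Sequence


def linear_cost(theta: Sequence[float], features: Sequence[float]) -> float:
    """Return C(s) = theta . f(s) for the feature vector of one state."""
    if len(theta) != len(features):
        raise ValueError("theta and f(s) must both have length K")
    return sum(w * f for w, f in zip(theta, features))


# Illustrative numbers only (assumption): K = 3 features.
theta = [0.5, 2.0, 1.0]   # weight vector, not the learned values
f_s = [0.2, 0.1, 0.0]     # feature values for one state s
cost = linear_cost(theta, f_s)  # 0.5*0.2 + 2.0*0.1 + 1.0*0.0
```

Because the cost is linear, IRL only has to fit the K weights; the features themselves are chosen beforehand from background knowledge.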

B. Framework overview

The probabilistic model proposed can be categorized as a Switching State Space Model (SSSM), in which a high-level layer reasons about the maneuvers being performed by the different interacting vehicles and determines the evolution of the low-level dynamics. Fig. 1 shows the graphical model that specifies the conditional independence assumptions of our model. Bold arrows indicate multi-vehicle dependencies. Focusing on the slice of the graphical model for vehicle $i$, we can observe three layers of abstraction:

• The highest level corresponds to the maneuver $m_t^i$ being executed by the vehicle. This is a discrete hidden random variable. In this work we consider two possible maneuvers: lane keeping (LK) and lane change (LC).

• The second level describes the state of the vehicle in a curved road frame through the continuous state vector $\mathbf{x}_t^i = [x, y, \psi, v, \omega]^T \in \mathbb{R}^5$, where $x$ and $y$ are the target coordinates, $v$ is the vehicle’s absolute linear speed along its direction of travel $\psi$, and $\omega$ is the yaw rate.

• Finally, the shaded nodes in the graphical model are the observations.
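The first two layers can be sketched as plain data types. This is only an illustrative container, not the authors' implementation; the names `Maneuver` and `VehicleState` are hypothetical, and reading x as the longitudinal and y as the lateral road coordinate is an assumption based on the experimental section, where y = 0 marks the lane boundary.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Maneuver(Enum):
    """Discrete hidden maneuver variable; the paper considers two values."""
    LK = "lane keeping"
    LC = "lane change"


@dataclass
class VehicleState:
    """Continuous state [x, y, psi, v, omega] in the curved road frame."""
    x: float      # road coordinate (assumed longitudinal)
    y: float      # road coordinate (assumed lateral)
    psi: float    # direction of travel
    v: float      # absolute linear speed along psi
    omega: float  # yaw rate

    def as_vector(self) -> List[float]:
        return [self.x, self.y, self.psi, self.v, self.omega]


state = VehicleState(x=10.0, y=1.5, psi=0.0, v=30.0, omega=0.0)
```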

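The factorization of the joint distribution given the model assumptions is the following:

$$
P\left(\mathbf{x}_{1:T}^{1:N}, m_{1:T}^{1:N}, \mathbf{z}_{1:T}^{1:N}\right) = P\left(\mathbf{x}_{1}^{1:N}, m_{1}^{1:N}, \mathbf{z}_{1}^{1:N}\right) \prod_{t=2}^{T} \prod_{i=1}^{N} \left[ P\left(\mathbf{x}_t^i \mid m_{t-1:t}^i, \mathbf{x}_{t-1}^{1:N}\right) P\left(m_t^i \mid m_{t-1}^{1:N}, \mathbf{x}_{t-1}^{1:N}\right) P\left(\mathbf{z}_t^i \mid \mathbf{x}_t^i\right) \right]
$$

where the notation $\mathbf{x}_{1:T}^{1:N}$ is shorthand for the tuple $(\mathbf{x}_1^1, \ldots, \mathbf{x}_T^1, \ldots, \mathbf{x}_1^N, \ldots, \mathbf{x}_T^N)$, $T$ indicates the number of timesteps considered and $N$ denotes the number of vehicles involved.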

The term $P(\mathbf{x}_t^i \mid m_{t-1:t}^i, \mathbf{x}_{t-1}^{1:N})$ describes the dynamic evolution of the state of a target given the distribution over maneuvers at the current and previous timesteps, and the previous states of all vehicles. The simplified kinematic bicycle model is used, with constant acceleration (CA) dynamics for the LK maneuver, and constant turn-rate and acceleration (CTRA) dynamics for the LC maneuver.
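To make the difference between the two motion regimes concrete, the sketch below propagates the state [x, y, psi, v, omega] by one timestep under each. It rests on several assumptions: the acceleration is passed in as an external input because it is not part of the five-dimensional state, the CTRA step uses a simple forward-Euler discretization, and the CA step keeps the heading fixed with zero yaw rate. It does not reproduce the simplified kinematic bicycle model itself, whose equations the source does not give.

```python
import math
from typing import List


def step_ca(state: List[float], accel: float, dt: float) -> List[float]:
    """Constant-acceleration step with fixed heading (lane keeping)."""
    x, y, psi, v, _omega = state
    travelled = v * dt + 0.5 * accel * dt ** 2
    return [
        x + travelled * math.cos(psi),
        y + travelled * math.sin(psi),
        psi,             # heading unchanged
        v + accel * dt,
        0.0,             # no turning in this regime (assumption)
    ]


def step_ctra(state: List[float], accel: float, dt: float) -> List[float]:
    """Constant turn-rate and acceleration step, forward Euler (lane change)."""
    x, y, psi, v, omega = state
    return [
        x + v * math.cos(psi) * dt,
        y + v * math.sin(psi) * dt,
        psi + omega * dt,  # heading rotates at the constant yaw rate
        v + accel * dt,
        omega,
    ]


start = [0.0, 0.0, 0.0, 30.0, 0.1]
after_lk = step_ca(start, accel=0.0, dt=0.1)
after_lc = step_ctra(start, accel=0.0, dt=0.1)
```

In a filter these functions would serve as the nonlinear prediction models, one per maneuver, which is why an extended (rather than linear) Kalman filter is needed.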

The maneuvering behavior of drivers is described in the predictive term $P(m_t^i \mid m_{t-1}^{1:N}, \mathbf{x}_{t-1}^{1:N})$. The probability of a driver choosing a maneuver depends heavily on the state of the other traffic participants. We take advantage of the risk-aversive IRL driver model to take into account these interactions and to forecast the most likely distribution over maneuvers at each timestep. This process is detailed in subsection III-C.

Finally, the term $P(\mathbf{z}_t^i \mid \mathbf{x}_t^i)$ is the measurement model and relates the hidden states of the vehicles in the scene with the observations through a linear measurement equation with added Gaussian noise.
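A linear-Gaussian measurement model of this kind can be sketched as follows. The measurement matrix `H`, which here observes only the two position coordinates, and the diagonal noise covariance are assumptions made for illustration; the source does not state which state components are measured.

```python
import math
from typing import List

# Hypothetical measurement matrix: observe x and y of [x, y, psi, v, omega].
H = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
]


def predict_measurement(h: List[List[float]], state: List[float]) -> List[float]:
    """Noise-free part of the measurement equation, z = H x."""
    return [sum(h_ij * s_j for h_ij, s_j in zip(row, state)) for row in h]


def gaussian_likelihood_diag(z: List[float], z_pred: List[float],
                             variances: List[float]) -> float:
    """Density N(z; z_pred, diag(variances)) of an observation."""
    density = 1.0
    for z_k, mean_k, var_k in zip(z, z_pred, variances):
        density *= math.exp(-0.5 * (z_k - mean_k) ** 2 / var_k)
        density /= math.sqrt(2.0 * math.pi * var_k)
    return density


z_pred = predict_measurement(H, [10.0, 1.5, 0.0, 30.0, 0.0])
lik = gaussian_likelihood_diag([10.0, 1.5], z_pred, [1.0, 1.0])
```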

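[Figure: graphical model over time slices t−1, t, t+1 for vehicle i, with maneuver nodes $m^i$, state nodes $\mathbf{x}^i$ and observation nodes $\mathbf{z}^i$.]

Fig. 1: Graphical model representation of the proposed switching state-space model.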

C. Interaction-Aware, Model-Based Maneuver Forecasting

By means of the risk-aversive driver model obtained with IRL, we can forecast the probability of each driver’s next maneuver in response to the states and maneuvers of the other traffic participants. The driver model used here balances the (navigational and risk) preferences of drivers and enables us to predict their anticipatory behavior. A driver will perform a maneuver at the current timestep if, given their prediction of the behavior of the surrounding drivers, this leads to a sequence of F future states that agree with their preferences (encoded in the driver model):

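$$
P\left(m_{t+1}^i = M \mid \mathbf{x}_t^{1:N}, m_t^{1:N}\right) \propto \mathbb{E}_{(\mathbf{x}_t^{1:N},\, m_t^{-i}) \sim P(\mathbf{x}_t^{1:N}, m_t^{1:N} \mid \mathbf{z}_{1:t}^{1:N})} \left[ 1 - \frac{\sum_{k=0}^{F} \theta^T f\left(\mathbf{x}_{t+k,M}^i\right)}{\sum_{m_{t+1}^i} \sum_{k=0}^{F} \theta^T f\left(\mathbf{x}_{t+k,m_{t+1}^i}^i\right)} \right]
$$

where we have overloaded the notation for the state to explicitly indicate the maneuver being used to propagate it between timesteps, and the notation $m^{-i}$ indicates the maneuvers of all agents except agent $i$. The expectation is taken with respect to the posterior at the previous timestep.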
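The bracketed term of the forecast can be illustrated with a short Python sketch. It assumes the cumulative cost of each candidate maneuver over the F future states has already been computed and is non-negative, and it replaces the expectation over the posterior with a single point estimate; both are simplifications for illustration. `maneuver_forecast` is a hypothetical helper name.

```python
from typing import Dict


def maneuver_forecast(cumulative_cost: Dict[str, float]) -> Dict[str, float]:
    """Turn per-maneuver cumulative costs into normalized probabilities.

    cumulative_cost[M] stands for sum_{k=0..F} theta^T f(x_{t+k,M}).
    Each maneuver scores 1 - cost_M / (sum of costs over all maneuvers),
    and the scores are then normalized to sum to one.
    """
    if len(cumulative_cost) < 2:
        raise ValueError("need at least two candidate maneuvers")
    total = sum(cumulative_cost.values())
    if total <= 0.0:
        raise ValueError("cumulative costs must sum to a positive value")
    scores = {m: 1.0 - c / total for m, c in cumulative_cost.items()}
    norm = sum(scores.values())
    return {m: s / norm for m, s in scores.items()}


# Illustrative costs (assumption): changing lanes is three times as costly.
probs = maneuver_forecast({"LK": 1.0, "LC": 3.0})
```

With these numbers the cheaper maneuver receives the larger share (0.75 for LK against 0.25 for LC), which matches the intent of the formula: the lower the expected cost of a maneuver relative to the alternatives, the more likely the driver is to choose it.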

D. Approximate Inference

Exact inference of the posterior $P(\mathbf{x}_t^i, m_t^i \mid \mathbf{z}_{1:t}^i)$ is intractable in the aSLDS, scaling exponentially with time [5], [6]. The proposed approximate inference engine is similar to the filtering approach presented in [6], with an extension to account for non-linear dynamics. Inference is performed individually per agent. The key idea is to approximate the intractable posterior with a simpler distribution (a Gaussian mixture), and to establish a recursion to track it over time. The posterior can be decomposed as:

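$$
P\left(\mathbf{x}_t^i, m_t^i \mid \mathbf{z}_{1:t}^i\right) = P\left(\mathbf{x}_t^i \mid m_t^i, \mathbf{z}_{1:t}^i\right) P\left(m_t^i \mid \mathbf{z}_{1:t}^i\right)
$$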

A recursion is established for each of the terms on the r.h.s. The first term is approximated with a Gaussian mixture distribution with C components:

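$$
P\left(\mathbf{x}_t^i \mid m_t^i, \mathbf{z}_{1:t}^i\right) \approx \sum_{c_t=1}^{C} P\left(\mathbf{x}_t^i \mid c_t, m_t^i, \mathbf{z}_{1:t}^i\right) P\left(c_t \mid m_t^i, \mathbf{z}_{1:t}^i\right)
$$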

where the term $P(\mathbf{x}_t^i \mid c_t, m_t^i, \mathbf{z}_{1:t}^i)$ is a Gaussian and the term $P(c_t \mid m_t^i, \mathbf{z}_{1:t}^i)$ indicates the weight of the mixture component. The first recursion is then established as:

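$$
\begin{aligned}
P\left(\mathbf{x}_{t+1}^i \mid m_{t+1}^i, \mathbf{z}_{1:t+1}^i\right) &= \sum_{m_t^i, c_t} P\left(\mathbf{x}_{t+1}^i, m_t^i, c_t \mid m_{t+1}^i, \mathbf{z}_{1:t+1}^i\right) \\
&= \sum_{m_t^i, c_t} P\left(\mathbf{x}_{t+1}^i \mid c_t, m_t^i, m_{t+1}^i, \mathbf{z}_{1:t+1}^i\right) P\left(m_t^i, c_t \mid m_{t+1}^i, \mathbf{z}_{1:t+1}^i\right)
\end{aligned}
$$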

The term $P(\mathbf{x}_{t+1}^i \mid c_t, m_t^i, m_{t+1}^i, \mathbf{z}_{1:t+1}^i)$ is obtained by propagating each component of the Gaussian mixture forward with all the available dynamics $m_{t+1}^i$ and by conditioning on the new observation $\mathbf{z}_{t+1}^i$. This leads to an exponential increase in the number of Gaussian components, which is collapsed back to C components per maneuver at the end of each inference step. The prediction and update steps are performed using an Extended Kalman Filter (EKF). To obtain the weights of the new mixture components we consider:
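The collapse step can be illustrated with moment matching, one common way to reduce a Gaussian mixture. The sketch below merges a scalar mixture into a single Gaussian (the case C = 1) that preserves the mixture mean and variance. The source does not say which collapsing rule is used, so both the rule and the scalar state are assumptions made for brevity.

```python
from typing import List, Tuple


def collapse_to_single_gaussian(weights: List[float], means: List[float],
                                variances: List[float]) -> Tuple[float, float]:
    """Moment-match a scalar Gaussian mixture with one Gaussian."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("mixture weights must sum to a positive value")
    w = [w_k / total for w_k in weights]  # normalize the weights
    mean = sum(w_k * mu_k for w_k, mu_k in zip(w, means))
    # Total variance = within-component variance + spread of the means.
    var = sum(w_k * (var_k + (mu_k - mean) ** 2)
              for w_k, mu_k, var_k in zip(w, means, variances))
    return mean, var


mean, var = collapse_to_single_gaussian(
    weights=[0.5, 0.5], means=[0.0, 2.0], variances=[1.0, 1.0]
)
```

For the two equally weighted components above, the merged Gaussian is centered at 1.0 with variance 2.0: the extra unit of variance accounts for the distance between the component means.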

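$$
P\left(m_t^i, c_t \mid m_{t+1}^i, \mathbf{z}_{1:t+1}^i\right) \propto P\left(\mathbf{z}_{t+1}^i \mid c_t, m_t^i, m_{t+1}^i, \mathbf{z}_{1:t}^i\right) P\left(m_{t+1}^i \mid c_t, m_t^i, \mathbf{z}_{1:t}^i\right) P\left(c_t \mid m_t^i, \mathbf{z}_{1:t}^i\right) P\left(m_t^i \mid \mathbf{z}_{1:t}^i\right)
$$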

The terms $P(c_t \mid m_t^i, \mathbf{z}_{1:t}^i)$ and $P(m_t^i \mid \mathbf{z}_{1:t}^i)$ are available from the previous step in the recursion; the term $P(\mathbf{z}_{t+1}^i \mid c_t, m_t^i, m_{t+1}^i, \mathbf{z}_{1:t}^i)$ is the likelihood of the observation $\mathbf{z}_{t+1}^i$ under the respective Kalman prediction, and the prior $P(m_{t+1}^i \mid c_t, m_t^i, \mathbf{z}_{1:t}^i)$ is calculated by means of the driver model as seen in subsection III-C. This is where the fusion between model and dynamics takes place. For further details, including the second recursion not detailed here, see [6]. This inference engine leads to a computational complexity that grows linearly in the number of vehicles.
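The weight computation is a product of four factors followed by a normalization, which the sketch below spells out for one fixed value of the next maneuver. The dictionary layout, the function name `mixture_weights` and all numbers are hypothetical; in the actual framework the likelihoods would come from the EKF prediction and the maneuver prior from the driver model.

```python
from typing import Dict, Tuple

Key = Tuple[str, int]  # (previous maneuver m_t, component index c_t)


def mixture_weights(likelihood: Dict[Key, float],
                    maneuver_prior: Dict[Key, float],
                    comp_weight: Dict[Key, float],
                    maneuver_post: Dict[str, float]) -> Dict[Key, float]:
    """Normalized P(m_t, c_t | m_{t+1}, z_{1:t+1}) for one fixed m_{t+1}.

    likelihood[(m, c)]     : P(z_{t+1} | c_t, m_t, m_{t+1}, z_{1:t})
    maneuver_prior[(m, c)] : P(m_{t+1} | c_t, m_t, z_{1:t}), from the driver model
    comp_weight[(m, c)]    : P(c_t | m_t, z_{1:t})
    maneuver_post[m]       : P(m_t | z_{1:t})
    """
    unnormalized = {
        key: (likelihood[key] * maneuver_prior[key]
              * comp_weight[key] * maneuver_post[key[0]])
        for key in likelihood
    }
    total = sum(unnormalized.values())
    if total <= 0.0:
        raise ValueError("all weights are zero; cannot normalize")
    return {key: w / total for key, w in unnormalized.items()}


# Illustrative values (assumption): C = 1 component, next maneuver fixed to LC.
weights = mixture_weights(
    likelihood={("LK", 0): 0.2, ("LC", 0): 0.6},
    maneuver_prior={("LK", 0): 0.1, ("LC", 0): 0.9},
    comp_weight={("LK", 0): 1.0, ("LC", 0): 1.0},
    maneuver_post={"LK": 0.8, "LC": 0.2},
)
```

In this toy case the previous posterior favors LK, yet the LC hypothesis ends up with the larger weight because both the observation likelihood and the model-based prior support it. This is the sense in which model and dynamics are fused.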

IV. EXPERIMENTAL EVALUATION

In this section, we focus on the validation of our framework’s ability to discriminate between lane change and lane keeping maneuvers in highway scenarios. Furthermore, we study how the model and dynamics-based predictions interact to determine the maneuver being executed by the target.

To validate our approach, we have gathered data on a French highway using an instrumented vehicle. By using a grid-based target tracker and a lane tracker, we are able to localize the targets with respect to the ego-vehicle (EV) and the road network. Fig. 2 shows one of the situations encountered, in which a fast-driving vehicle overtakes the EV and merges in front of it. The lateral position of the target in road coordinates (the separating lane marking corresponds to y = 0) is shown in the first row of Fig. 2c. The longitudinal position of the target with respect to the EV is shown in the second row. The third row shows the model-based prediction considering a prediction horizon of 3 s and a sampling interval of 0.1 s. It can be seen that the predicted probability of an LC maneuver is low while the target is alongside the EV, and begins to increase as the target gets farther ahead, since drivers typically drive on the right-most lane.

The fourth row shows the dynamics-based estimate, obtained using an uninformative model (a model that assigns the same probability to all maneuvers). The last row shows the estimate obtained by fusing dynamic evidence with the model-based prediction, as explained in the previous section. The lane change maneuver is detected at around t = 10.0 s, approximately 1.5 s before the target crosses the lane marking and 0.2 s earlier than with the dynamics-only estimate. The dynamics-only estimate is prone to false-positive detections (see t = 2.5 s) caused by the oscillations of drivers around the lane center. The scene understanding brought into the estimate by the model-based prediction leads to a slower response to unlikely maneuvers, potentially reducing the number of false positives. We can thus conclude that including the model-based prediction in the filtering framework results in faster detections and an increased robustness in the maneuver estimations. Finally, in cases where the dynamics-only estimate is confident enough and the model disagrees (see t = 6.0 s to t = 9.5 s), the final estimate is dominated by the dynamics. This is the desired behavior and constitutes a significant improvement over pure model-based prediction approaches [1].

(a) Front view at t = 4.0 s (b) Front view at t = 11.2 s

(c) Data from the target and maneuver probability estimates. Rows, top to bottom: lateral position; relative longitudinal position; model prediction (LK, LC); dynamics-only estimate; dynamics + model estimate. Horizontal axis: time, from 0 to 20.

Fig. 2: Typical highway scenario: a driver overtakes the ego-vehicle and merges in front of it.

This example highlights the potential of the presented framework to identify lane change maneuvers on highways by combining dynamics-based and model-based inference.

REFERENCES

[1] D. Sierra Gonzalez, J. S. Dibangoye, and C. Laugier, “High-Speed Highway Scene Prediction Based on Driver Models Learned From Demonstrations,” in IEEE 19th International Conference on Intelligent Transportation Systems, Rio de Janeiro, Brazil, Nov. 2016.

[2] K. Weiss, N. Kaempchen, and A. Kirchner, “Multiple-model tracking for the detection of lane change maneuvers,” in IEEE Intelligent Vehicles Symposium, 2004, June 2004, pp. 937–942.

[3] R. Toledo-Moreo and M. Zamora-Izquierdo, “IMM-based lane-change prediction in highways with low-cost GPS/INS,” IEEE Transactions on Intelligent Transportation Systems, vol. 10, no. 1, pp. 180–185, 2009.

[4] T. Gindele, S. Brechtel, and R. Dillmann, “A probabilistic model for estimating driver behaviors and vehicle trajectories in traffic environments,” in 13th International IEEE Conference on Intelligent Transportation Systems, Sept 2010, pp. 1625–1631.

[5] G. Agamennoni, J. I. Nieto, and E. M. Nebot, “Estimation of multivehicle dynamics by considering contextual information,” IEEE Transactions on Robotics, vol. 28, no. 4, pp. 855–870, Aug 2012.

[6] D. Barber, “Expectation correction for smoothed inference in switching linear dynamical systems,” Journal of Machine Learning Research, vol. 7, pp. 2515–2540, 2006.
