Pursuit-Evasion Games with Multi-Pursuer: a decomposition approach.

Adriano Festa

(joint work with Richard B. Vinter)

EEE Department, IC London.

27th November 2012

G. Castelnuovo, Sapienza Universita di Roma

Festa-Vinter Multi-Pursuer Differential Games

Outline of the talk

1 Starting example: Surge Tank Control

2 Pursuit-Evasion games

3 A decomposition technique vs high dimensionality

4 Numerical Tests

5 Concluding Remarks


The Surge Tank Control Problem

Falugi, Kountouriotis, Vinter. Differential Games Controllers That Confine a System to a Safe Region in the State Space, With Applications to Surge Tank Control. IEEE Trans. Automat. Contr. 57(11): 2778-2788 (2012)


The Surge Tank Control Problem

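\[
\frac{dx(t)}{dt} = f(x(t), a(t)) + \sigma(x(t))\, b(t)
= \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} x(t)
+ \begin{bmatrix} 0 \\ 1 \end{bmatrix} \bigl(-a(t) + b(t)\bigr).
\]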

The constraints that the surge tank must neither overflow nor empty are expressible (in normalized units) as

−1 < x1(t) < +1,

so x ∈ Ω := (−1,1) × R. Permitted tolerances on the Max Rate of Change of Outflow (MROC) index are captured by the additional constraint on the outflow:

−1 ≤ a(t) ≤ +1 .


Surge Tank as a Differential Game

A := {a(.) : [0,∞) → R | a(t) ∈ [−1,1]} .

B := {b(.) : [0,∞) → R} .

The space Φ of closed loop controls for the a player is

Φ := {non-anticipative mappings φ(.) : B → A} .

The differential game is: find

\[
v(x) = \sup_{\varphi \in \Phi} \, \inf_{b \in B} \, J\bigl(x, \varphi(b(\cdot)), b(\cdot)\bigr)
\]

where the payoff function is

\[
J(x, a, b) := \int_0^{\tau_x} \Bigl( \tfrac{1}{2} |b(t)|^2 + \theta \Bigr)\, dt ,
\]

with θ ≥ 0 (a design parameter) and τx the first exit time from Ω.
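To make the payoff concrete, the following minimal sketch integrates the surge tank dynamics for fixed feedback laws and accumulates J up to the first exit time. The function name `simulate_payoff`, the callbacks `a_fn` and `b_fn`, the explicit Euler scheme, the step size, and the horizon cap `t_max` are all assumptions made for illustration; they are not part of the talk.

```python
import numpy as np

def simulate_payoff(x0, a_fn, b_fn, theta=1.0, dt=1e-3, t_max=50.0):
    """Euler-integrate x1' = x2, x2' = -a + b and accumulate the integral of
    (0.5*|b|^2 + theta) until x1 leaves (-1, 1) or the cap t_max is reached.

    a_fn and b_fn are hypothetical feedback laws (t, x) -> control value.
    Returns the elapsed time and the accumulated payoff."""
    x = np.array(x0, dtype=float)
    t, cost = 0.0, 0.0
    while abs(x[0]) < 1.0 and t < t_max:
        a = float(np.clip(a_fn(t, x), -1.0, 1.0))  # enforce the constraint |a| <= 1
        b = float(b_fn(t, x))                      # b is unconstrained
        cost += (0.5 * b * b + theta) * dt
        x = x + dt * np.array([x[1], -a + b])
        t += dt
    return t, cost

# Illustrative run: no outflow correction (a = 0) and a constant push b = 2.
# Then x1(t) = t^2, so the exit time is close to 1.
t_exit, payoff = simulate_payoff((0.0, 0.0), lambda t, x: 0.0, lambda t, x: 2.0)
```

This only evaluates J for one fixed pair of controls; it does not compute the sup-inf value v(x).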


Safe Region for the Surge Tank


Associated Optimal Control Problems

The set Ω = {x | −1 < h(x) < +1} can be represented as

Ω = Ω1 ∩ Ω2, where

Ω1 = {x | h(x) < +1}, Ω2 = {x | −1 < h(x)}

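For every x ∈ Ωj, j = 1,2, consider the optimal control problem

\[
(P^j_x) \quad
\begin{cases}
v_j(x) = \inf_{b \in B} \displaystyle\int_0^{\tau_x} \Bigl( \tfrac{1}{2} |b(t)|^2 + \theta \Bigr)\, dt \\
\dot{y}(t) = f(y(t), a_j) + \sigma(y(t))\, b(t) \quad \text{a.e. on } [0, \tau_x) \\
y(0) = x, \quad h_j(y(\tau_x)) = 0 .
\end{cases}
\]

Here aj is frozen at the value of a that drives the state away from the boundary of Ωj.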

v(x) = min{v1(x), v2(x)} ∀x ∈ Ω


Comparison with min-max methods

Figure: Value function using a minmax technique (left) and decomposition procedure (right), θ = 10.


Pursuit-Evasion games

The dynamic system is modelled as

\[
\begin{cases}
y'(t) = -g(y(t))\, a(t) + h(y(t))\, b(t) + l(y(t)) \\
y(0) = x
\end{cases}
\tag{1}
\]

where $y(t) \in \Omega \subset \mathbb{R}^N$ is the state, and a and b are the controls.

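We assume

\[
\begin{cases}
g : \Omega \to \mathbb{R}^N, \ h : \Omega \to \mathbb{R}^N \ \text{are continuous} \\
A, B \ \text{are compact metric spaces}
\end{cases}
\tag{2}
\]

and, for some constant L,

\[
|g(x) - g(y)| + |h(x) - h(y)| \le L |x - y| \quad \forall x, y \in \mathbb{R}^N . \tag{3}
\]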


Pursuit-Evasion games

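We take as admissible controls

\[
\mathcal{A} := \{ a : [0, +\infty) \to A \ \text{measurable} \} \tag{4}
\]

\[
\mathcal{B} := \{ b : [0, +\infty) \to B \ \text{measurable} \} \tag{5}
\]

and we consider only $a \in \mathcal{A}$, $b \in \mathcal{B}$. We are given a closed set $T \subseteq \mathbb{R}^N$, and define

\[
t_x(a, b) :=
\begin{cases}
\min \{ t : y_x(t; a, b) \in T \} \\
+\infty \quad \text{if } y_x(t; a, b) \notin T \ \ \forall t .
\end{cases}
\tag{6}
\]

The first player "a" wants to minimize the hitting time, and the second player "b" wants to maximize the same cost.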


PE games

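We re-normalize these costs by the nonlinear transformation

\[
\psi(u) :=
\begin{cases}
1 - e^{-u} & \text{if } u < +\infty \\
1 & \text{if } u = +\infty
\end{cases}
\tag{7}
\]

and consider the discounted cost functional

\[
J(x, a, b) = \psi(t_x(a, b)) = \int_0^{t_x} e^{-s}\, ds . \tag{8}
\]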
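As a quick numerical sanity check of the identity in (8), the sketch below compares ψ(t) with a midpoint-rule approximation of the integral of e^{-s} over [0, t]. The helper names `psi` and `discounted_cost` and the quadrature resolution are illustrative assumptions.

```python
import math

def psi(u):
    """Rescaling (7): maps a hitting time in [0, +inf] to [0, 1]."""
    return 1.0 if math.isinf(u) else 1.0 - math.exp(-u)

def discounted_cost(t_hit, n=100_000):
    """Midpoint-rule approximation of the integral of e^{-s} over [0, t_hit]."""
    h = t_hit / n
    return sum(math.exp(-(k + 0.5) * h) for k in range(n)) * h

# The two expressions in (8) should agree up to quadrature error.
gap = abs(psi(2.0) - discounted_cost(2.0))
```

The transformation is increasing, so minimizing or maximizing the hitting time is equivalent to minimizing or maximizing the rescaled cost, and a game that never terminates receives the finite cost 1.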


PE games

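We need the notion of nonanticipating strategy. For the first player it is

\[
\Gamma := \{ \alpha : \mathcal{B} \to \mathcal{A} \ : \ \text{for all } t > 0, \ b(s) = \tilde{b}(s) \ \text{for all } s \le t \ \text{implies} \ \alpha[b](s) = \alpha[\tilde{b}](s) \ \text{for all } s \le t \} \tag{9}
\]

and for the second player the set ∆ is defined in the analogous way.

The lower and the upper values for the game are

\[
v(x) := \sup_{\beta \in \Delta} \, \inf_{a \in \mathcal{A}} J(x, a, \beta[a]) = \inf_{\alpha \in \Gamma} \, \sup_{b \in \mathcal{B}} J(x, \alpha[b], b) . \tag{10}
\]

The lower and upper values coincide because of the structure of the dynamics and the cost functional.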


PE games

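The value function v(x) is the viscosity solution of the following Hamilton-Jacobi-Isaacs equation

\[
\begin{cases}
v(x) + H(x, Dv(x)) = 0 & x \in \Omega \setminus T \\
v(x) = 0 & x \in \partial T
\end{cases}
\tag{11}
\]

where

\[
\begin{aligned}
H(x, p) :=\ & \max_{a \in A} \min_{b \in B} \{ -(-g(x) a + h(x) b + l(x)) \cdot p \} - 1 \\
=\ & \min_{b \in B} \max_{a \in A} \{ -(-g(x) a + h(x) b + l(x)) \cdot p \} - 1 .
\end{aligned}
\tag{12}
\]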


Why the value function?

Solving this equation, and therefore obtaining the value function of the game, we can recover the optimal behavior of each player from the starting point x0 as

\[
a(t) = S(y_{x_0}(t)), \qquad S(z) \in \arg\max_{a \in A} \min_{b \in B} \{ -(-g(z) a + h(z) b + l(z)) \cdot Dv(z) \} \tag{13}
\]

\[
b(t) = W(y_{x_0}(t)), \qquad W(z) \in \arg\min_{b \in B} \max_{a \in A} \{ -(-g(z) a + h(z) b + l(z)) \cdot Dv(z) \} . \tag{14}
\]


Example 1

We consider the pursuit-evasion game with two pursuers p1, p2 and one evader e, where all the agents are free to move in 1D space with different velocities.

\[
\begin{cases}
p_1' = \tfrac{2}{3} a_1 \\
p_2' = a_2 \\
e' = \tfrac{b}{2} \\
p_1(0) = p_1^0, \quad p_2(0) = p_2^0, \quad e(0) = e^0
\end{cases}
\tag{15}
\]

where a1, a2, b ∈ B(0,1) = [−1,1] and p1, p2, e ∈ R.

Capture happens when $\min_{i \in \{1,2\}} |p_i - e| \le r$ for some r ≥ 0. We underline that this problem is in a space of dimension three.


Example 1- reduced dynamics

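We get the reduced dynamics

\[
\begin{cases}
y_1' = -\tfrac{2}{3} a_1 + \tfrac{b}{2} \\
y_2' = -a_2 + \tfrac{b}{2} \\
y_1(0) = p_1^0 - e^0 \\
y_2(0) = p_2^0 - e^0
\end{cases}
\tag{16}
\]

where a1, a2, b ∈ B(0,1), y1, y2 ∈ [0,+∞] and

\[
T := \{ (y_1, y_2) \in \mathbb{R}^2 : \min_{i \in \{1,2\}} |y_i| \le r \} . \tag{17}
\]

The HJI equation associated to the problem is

\[
\begin{cases}
v(x) + \max_{a_1, a_2 \in A} \min_{b \in B} \bigl\{ -\bigl( -\tfrac{2}{3} a_1 + \tfrac{b}{2}, \ -a_2 + \tfrac{b}{2} \bigr) \cdot Dv(x) \bigr\} = 1 & x \in [0,+\infty]^2 \setminus T \\
v(x) = 0 & x \in \partial T
\end{cases}
\tag{18}
\]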
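The reduced dynamics (16) can be explored with a short simulation. The sketch below assumes a particular strategy pair that is not stated on the slides: both pursuers close in at full speed (a1 = a2 = 1) and the evader flees at full speed (b = 1), with y1, y2 ≥ 0. The function name `capture_time`, the Euler step, and the capture radius r = 0.1 are illustrative choices.

```python
def capture_time(y1, y2, r=0.1, dt=1e-4, t_max=100.0):
    """Euler simulation of the reduced dynamics (16).

    Assumed strategies (not taken from the slides): a1 = a2 = 1, so both
    pursuers close in at full speed, and b = 1, so the evader flees at
    full speed. Assumes y1, y2 >= 0 as in (16)."""
    a1 = a2 = b = 1.0
    t = 0.0
    while min(abs(y1), abs(y2)) > r and t < t_max:
        y1 += dt * (-(2.0 / 3.0) * a1 + 0.5 * b)  # net rate -1/6
        y2 += dt * (-a2 + 0.5 * b)                # net rate -1/2
        t += dt
    return t

# Under these assumptions each gap shrinks at a constant rate, so the
# simulated time can be compared with gap / closing speed.
t_sim = capture_time(1.0, 1.0)
t_formula = min((1.0 - 0.1) / (1.0 / 6.0), (1.0 - 0.1) / 0.5)
```

Under these assumed strategies the two coordinates evolve independently, which is the feature the decomposition exploits.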

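A decomposition technique vs high dimensionality

We need some additional hypotheses. We assume

\[
\begin{cases}
A = B_n(0, \rho_{a_1}) \times B_n(0, \rho_{a_2}) \times \dots \times B_n(0, \rho_{a_M}), \\
B = [B_n(0, \rho_b)]^m, \\
g(x) \rho_a - h(x) \rho_b - |l(x)| > 0, \quad \forall x \in \Omega ,
\end{cases}
\tag{19}
\]

where by $[B_n(0, \rho_b)]^m$ we mean the space

\[
\{ ( \underbrace{b_1, \dots, b_n, \ b_1, \dots, b_n, \ \dots, \ b_1, \dots, b_n}_{(b_1, \dots, b_n) \ \text{repeated } m \text{ times}} ) \in \mathbb{R}^N : |(b_1, \dots, b_n)| = \rho_b \}
\]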


Under these hypotheses we can show

Proposition
H(x,p) is convex with respect to the variable p.

\[
H(x, p) = g(x) \max_{a \in A} \{ a \cdot p \} - h(x) \max_{b \in B} \{ -b \cdot p \} - l(x) \cdot p - 1
\]

Figure: in this case N = 2,m = 2,n = 1.


Decomposition

We consider the following class of problems, with $i \in I := \{1, \dots, m\} \subset \mathbb{N}$,

\[
\begin{cases}
u_i(x) + H(x, Du_i(x)) = 0 & x \in \Omega \setminus T_i \\
u_i(x) = 0 & x \in T_i
\end{cases}
\tag{20}
\]

where $T := \cup_i T_i$.

Theorem

Assume the standard hypotheses and (19). Then

\[
u(x) := \min \{ u_i(x) ; \ i \in I \} \tag{21}
\]

is the unique value function of the PE game.


Useful? Example 1

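T := T1 ∪ T2 with $T_i := \{ (x_1, x_2) \in \mathbb{R}^2 : |x_i| \le r \}$;

the second equation of the dynamics doesn't affect the decomposed problem associated with T1.

\[
\begin{cases}
v(x_1, x_2) + \max_{a_1 \in A} \min_{b \in B} \bigl\{ -\bigl( -\tfrac{2}{3} a_1 + \tfrac{b}{2} \bigr) \cdot \tfrac{\partial}{\partial x_1} v(x_1, x_2) \bigr\} = 1 & x_1 \in (r, +\infty] \\
v(x_1, x_2) = 0 & x_1 \in [0, r] \\
\tfrac{\partial}{\partial x_2} v(x_1, x_2) = 0
\end{cases}
\tag{22}
\]

This is rather easy to solve. We get

\[
v(x_1, x_2) = r e^{-6(x_1 - r)} - e^{-6(x_1 - r)} + 1 \quad \text{and} \quad u(x_1, x_2) = -6 (x_1 - r)(1 - r) . \tag{23}
\]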


Useful? Example 1

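In the same way, if we consider T2,

\[
v(x_1, x_2) = r e^{-2(x_2 - r)} - e^{-2(x_2 - r)} + 1 \quad \text{and} \quad u(x_1, x_2) = -2 (x_2 - r)(1 - r) . \tag{24}
\]

The hypotheses of the Theorem are satisfied:

\[
g(x) \rho_a - h(x) \rho_b = \bigl( \tfrac{2}{3} - \tfrac{1}{2}, \ 1 - \tfrac{1}{2} \bigr) = \bigl( \tfrac{1}{6}, \tfrac{1}{2} \bigr) > 0
\]

\[
u(x_1, x_2) =
\begin{cases}
-6 (x_1 - r)(1 - r) & \text{if } x_1 \le \tfrac{1}{3} x_2 + \tfrac{2}{3} r \\
-2 (x_2 - r)(1 - r) & \text{if } x_1 > \tfrac{1}{3} x_2 + \tfrac{2}{3} r .
\end{cases}
\tag{25}
\]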

(25)


Example 1 - value function

In this way, by solving two simpler problems (each of dimension 1), we get the solution of the original one.
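The gluing step can be sketched on a grid. The code below assumes, for illustration only, that the time for each pursuer alone is the gap divided by its net closing speed (1/6 and 1/2, as in the hypothesis check), takes the pointwise minimum, and compares the region where the first pursuer is faster with the switching line x1 = x2/3 + 2r/3 from (25). The grid, the value r = 0.1, and all variable names are assumptions.

```python
import numpy as np

r = 0.1
# Grid over the region where neither pursuer has captured yet.
x1, x2 = np.meshgrid(np.linspace(r, 2.0, 50), np.linspace(r, 2.0, 50), indexing="ij")

t1 = 6.0 * (x1 - r)   # assumed time for pursuer 1 alone: gap / (2/3 - 1/2)
t2 = 2.0 * (x2 - r)   # assumed time for pursuer 2 alone: gap / (1 - 1/2)
t_min = np.minimum(t1, t2)   # pointwise minimum of the two 1D solutions

# Region where the first branch is selected, as in (25).
first_wins = x1 <= x2 / 3.0 + 2.0 * r / 3.0
```

Each of `t1` and `t2` depends on a single coordinate, so the two-dimensional function `t_min` is assembled from two one-dimensional computations.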


Sketches of the proof I

Definition
We say that v : Ω → R is semiconcave on the open convex set Ω if there exists a constant C ≥ 0 such that

\[
\lambda v(x) + (1 - \lambda) v(y) \le v(\lambda x + (1 - \lambda) y) + \tfrac{1}{2} C \lambda (1 - \lambda) |x - y|^2 \tag{26}
\]

for all x, y ∈ Ω and λ ∈ [0,1].

We can show that in our case, every solution of a decomposed problem is semiconcave. Moreover, semiconcave functions have the following property. Setting

\[
D^* v(x) = \bigl\{ p \in \mathbb{R}^N : p = \lim_{n \to +\infty} Dv(x_n), \ x_n \to x \bigr\}
\]

we have $D^+ v(x) = \mathrm{co}\, D^* v(x)$.

Sketches of the proof II

We want to show that u = min{ui ; i ∈ I} is the viscosity solution of the HJI equation associated to the PE game.

The minimum of a family of supersolutions is always a supersolution.

We prove that u is a subsolution, too. We know that the property of semiconcavity is preserved when taking the minimum of a class of semiconcave functions, so

\[
D^+ u(x) = \mathrm{co}\, D^* u(x) \subseteq \mathrm{co} \{ D^* u_i(x) \mid i \in I \} \subseteq \mathrm{co} \{ \mathrm{co}\, D^* u_i(x) \mid i \in I \} = \mathrm{co} \{ D^+ u_i(x) \mid i \in I \} \tag{27}
\]
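The fact that semiconcavity survives a pointwise minimum can be probed numerically against inequality (26). The sketch below uses two hypothetical one-dimensional functions, u1(x) = x^2 and u2(x) = (x − 1)^2, which satisfy (26) with C = 2, and measures the largest violation of (26) on random samples. The function `semiconcavity_gap`, the sampling interval, and the test functions are assumptions for illustration.

```python
import numpy as np

def semiconcavity_gap(v, C, n=10_000, seed=0):
    """Largest sampled value of LHS - RHS in inequality (26) for a 1D
    function v. A value <= 0 (up to rounding) is consistent with
    semiconcavity with constant C on the sampled interval."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-3.0, 3.0, n)
    y = rng.uniform(-3.0, 3.0, n)
    lam = rng.uniform(0.0, 1.0, n)
    lhs = lam * v(x) + (1.0 - lam) * v(y)
    rhs = v(lam * x + (1.0 - lam) * y) + 0.5 * C * lam * (1.0 - lam) * (x - y) ** 2
    return float(np.max(lhs - rhs))

u1 = lambda x: x ** 2            # hypothetical example, semiconcave with C = 2
u2 = lambda x: (x - 1.0) ** 2    # hypothetical example, semiconcave with C = 2
u_min = lambda x: np.minimum(u1(x), u2(x))

gap_min = semiconcavity_gap(u_min, C=2.0)   # minimum keeps the same constant
gap_abs = semiconcavity_gap(np.abs, C=2.0)  # |x| has a convex kink and violates (26)
```

A sampled check of this kind can only falsify the inequality, never prove it; here it is meant to contrast a minimum (which introduces concave kinks) with a function having a convex kink.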


Sketches of the proof III

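This implies that any $p \in D^+ u(x)$ can be written, with $\Lambda = \{ (\lambda_i)_{i \in I} : \lambda_i \ge 0, \ \sum_i \lambda_i = 1 \}$, as

\[
p = (\lambda_1, \lambda_2, \dots) \cdot (p_1, p_2, \dots) = \sum_i \lambda_i p_i , \qquad (\lambda_1, \lambda_2, \dots) \in \Lambda, \quad p_i \in D^+ u_i(x) . \tag{28}
\]

We know that $H(x, p_i) \le 0$ for every $p_i \in D^+ u_i(x)$ and every i ∈ I. By the convexity of H with respect to p,

\[
H(x, p) = H\Bigl( x, \sum_i \lambda_i p_i \Bigr) \le \sum_i \lambda_i H(x, p_i) \le 0 . \tag{29}
\]

This shows that u(x) is a subsolution and concludes the proof.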


Decomposition technique for PE games

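Theorem
Let I := {1, 2, ..., m} and $T := \{ y_i \in \mathbb{R}^n : \min_i |y_i| \le r \}$. Let $v_i : \mathbb{R}^N \times I \to \mathbb{R}$, with $x_i = (x_{ni}, \dots, x_{n(i+1)-1})$, solve

\[
\begin{cases}
v_i(x_i) + \max_{a_i \in A} \min_{b \in B} \bigl\{ -f_i(x_i, a_i, b) \cdot \tfrac{\partial}{\partial x_i} v_i(x_i) \bigr\} = 1 & x_i \in [0,+\infty]^n \setminus T_i \\
v_i(x_i) = 0 & x_i \in T_i \\
\tfrac{\partial}{\partial x} v_i(x) = 0
\end{cases}
\tag{30}
\]

with $f_i(x_i, a_i, b) = -g(x_i) a_i(t) + h(x_i) b(t) + l(x_i)$. Then the value function of the PE game is

\[
u(x) := -\log \Bigl( 1 - \min_{i \in I} v_i(x) \Bigr) .
\]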
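Since the rescaling (7) is ψ(u) = 1 − e^{-u}, a time is recovered from a rescaled value v as −log(1 − v). The sketch below applies this to the pointwise minimum of several rescaled values; the function name `time_from_values` and the sample capture times are hypothetical.

```python
import numpy as np

def time_from_values(v_list):
    """Invert psi(u) = 1 - exp(-u) on the pointwise minimum of the v_i,
    i.e. return -log(1 - min_i v_i). A value v = 1 maps to +inf."""
    v_min = np.minimum.reduce([np.asarray(v, dtype=float) for v in v_list])
    with np.errstate(divide="ignore"):
        return -np.log1p(-v_min)

# Hypothetical capture times for two decomposed problems at two states.
t1 = np.array([0.5, 3.0])
t2 = np.array([1.0, 2.0])
u = time_from_values([1.0 - np.exp(-t1), 1.0 - np.exp(-t2)])
```

Because the rescaling is increasing, taking the minimum before or after inverting it gives the same result, so `u` matches the pointwise minimum of `t1` and `t2`.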


Tag-Chase: example 1

(case1 c = 0.95)




(case2 c = 0.95)




(case3 c = 0.95)



Remarks

The key idea to preserve convexity of the Hamiltonian is:

there is a “specular” behavior of the two players

the player who minimizes “dominates” the other one

There is a generalization of this result but, so far, we need the semiconcavity of the decomposed problems (a strong and inconvenient assumption).

Thank you.

